msm-4.14

mirror of https://github.com/rd-stuffs/msm-4.14.git synced 2025-02-20 11:45:48 +08:00

Author	SHA1	Message	Date
Nick Alcock	7de5fa111a	UPSTREAM: unicode: remove MODULE_LICENSE in non-modules Since commit 8b41fc4454e ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations are used to identify modules. As a consequence, uses of the macro in non-modules will cause modprobe to misidentify their containing object file as a module when it is not (false positives), and modprobe might succeed rather than failing with a suitable error message. So remove it in the files in this commit, none of which can be built as modules. Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Suggested-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Gabriel Krisman Bertazi <krisman@suse.de> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: linux-modules@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com> Cc: Gabriel Krisman Bertazi <krisman@collabora.com> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: mrsrimar22 <mar.pashter1922@gmail.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-07-11 19:27:21 -03:00
Christoph Hellwig	3764e06c6d	UPSTREAM: unicode: clean up the Kconfig symbol confusion Turn the CONFIG_UNICODE symbol into a tristate that generates some always built in code and remove the confusing CONFIG_UNICODE_UTF8_DATA symbol. Note that a lot of the IS_ENABLED() checks could be turned from cpp statements into normal ifs, but this change is intended to be fairly mechanic, so that should be cleaned up later. Fixes: 2b3d04787012 ("unicode: Add utf8-data module") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Change-Id: I91c9031a7320e996b1cd931a18d79bfab05ee959 Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: mrsrimar22 <mar.pashter1922@gmail.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-07-11 19:25:23 -03:00
Krzysztof Wilczynski	832e98b3c6	UPSTREAM: unicode: Move static keyword to the front of declarations Move the static keyword to the front of declarations of nfdi_test_data and nfdicf_test_data, and resolve the following compiler warnings that can be seen when building with warnings enabled (W=1): fs/unicode/utf8-selftest.c:38:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration] fs/unicode/utf8-selftest.c:92:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration] Signed-off-by: Krzysztof Wilczynski <kw@linux.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: mrsrimar22 <mar.pashter1922@gmail.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-07-11 18:38:22 -03:00
Linus Torvalds	c4db30fcd3	UPSTREAM: unicode: fix .gitignore for generated utfdata file Commit 2b3d04787012 ("unicode: Add utf8-data module") changed the generated utf8data file from 'utf8data.h' to 'utf8data.c', but didn't change the comments or the .gitignore to match. The comments should be updated too, but at least they don't cause any visible breakage. But the gitignore file needs changing to avoid git complaining about untracked files. Fixes: 2b3d04787012 ("unicode: Add utf8-data module") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: mrsrimar22 <mar.pashter1922@gmail.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-07-11 18:38:21 -03:00
Christoph Hellwig	adbec9dbf3	UPSTREAM: unicode: only export internal symbols for the selftests The exported symbols in utf8-norm.c are not needed for normal file system consumers, so move them to conditional _GPL exports just for the selftest. Change-Id: Ie9170398b992e5b4ece631ba302e3e8feb8b5ef7 Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-06-15 15:31:08 -03:00
Christoph Hellwig	cbc05b0350	UPSTREAM: unicode: Add utf8-data module utf8data.h contains a large database table which is an auto-generated decodification trie for the unicode normalization functions. Allow building it into a separate module. Based on a patch from Shreeya Patel <shreeya.patel@collabora.com>. Change-Id: I6ce97a0249b1fedbd27b3f05d73459f5a02d3e75 Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-06-15 15:31:08 -03:00
Christoph Hellwig	89b5457241	UPSTREAM: unicode: cache the normalization tables in struct unicode_map Instead of repeatedly looking up the version add pointers to the NFD and NFD+CF tables to struct unicode_map, and pass a unicode_map plus index to the functions using the normalization tables. Change-Id: Iced7cfaa61057defd15581258a8d1b093f041759 Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-06-15 15:31:08 -03:00
Christoph Hellwig	1da79aa576	UPSTREAM: unicode: move utf8cursor to utf8-selftest.c Only used by the tests, so no need to keep it in the core. Change-Id: Id1f83cc3575f2683e7020268e34336b68eb55318 Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-06-15 15:31:08 -03:00
Christoph Hellwig	02f80a8bf8	UPSTREAM: unicode: simplify utf8len Just use the utf8nlen implementation with a (size_t)-1 len argument, similar to utf8_lookup. Also move the function to utf8-selftest.c, as it isn't used anywhere else. Change-Id: I6a5b7f73de9493ed76a2aa884db1cc6ee99655d3 Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-06-15 15:31:08 -03:00
Christoph Hellwig	12489ae4c6	UPSTREAM: unicode: remove the unused utf8{,n}age{min,max} functions Not actually used anywhere. Change-Id: Ia319ea78385ec20c96036e83aac378638429e6cd Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-06-15 15:31:08 -03:00
Christoph Hellwig	343d1e862f	UPSTREAM: unicode: pass a UNICODE_AGE() triple to utf8_load Don't bother with pointless string parsing when the caller can just pass the version in the format that the core expects. Also remove the fallback to the latest version that none of the callers actually uses. Change-Id: I4c0056bcdce44ea655d5da0f69f92f7a1a590445 Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-06-15 15:31:08 -03:00
Christoph Hellwig	f0006a3c97	UPSTREAM: unicode: remove the charset field from struct unicode_map It is hardcoded and only used for a f2fs sysfs file where it can be hardcoded just as easily. Change-Id: I4a0e2e143a62c411dbcff89a2398ed72e8125500 Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: chrisl7 <wandersonrodriguesf1@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-06-15 15:31:08 -03:00
Daniel Rosenberg	afff68f6cf	unicode: Add utf8_casefold_hash This adds a case insensitive hash function to allow taking the hash without needing to allocate a casefolded copy of the string. The existing d_hash implementations for casefolding allocate memory within rcu-walk, by avoiding it we can be more efficient and avoid worrying about a failed allocation. Signed-off-by: Daniel Rosenberg <drosen@google.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-09-11 11:23:08 -07:00
Gabriel Krisman Bertazi	0abff11d2e	ext4: optimize case-insensitive lookups Temporarily cache a casefolded version of the file name under lookup in ext4_filename, to avoid repeatedly casefolding it. I got up to 30% speedup on lookups of large directories (>100k entries), depending on the length of the string under lookup. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 19:08:46 -07:00
Theodore Ts'o	7db4acd88a	unicode: update to Unicode 12.1.0 final Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Gabriel Krisman Bertazi <krisman@collabora.com>	2019-09-23 19:08:44 -07:00
Theodore Ts'o	1fc69d99b2	unicode: add missing check for an error return from utf8lookup() Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Gabriel Krisman Bertazi <krisman@collabora.com>	2019-09-23 19:08:43 -07:00
Masahiro Yamada	a01d5c6fe9	unicode: refactor the rule for regenerating utf8data.h scripts/mkutf8data is used only when regenerating utf8data.h, which never happens in the normal kernel build. However, it is irrespectively built if CONFIG_UNICODE is enabled. Moreover, there is no good reason for it to reside in the scripts/ directory since it is only used in fs/unicode/. Hence, move it from scripts/ to fs/unicode/. In some cases, we bypass build artifacts in the normal build. The conventional way to do so is to surround the code with ifdef REGENERATE_. For example, - 7373f4f83c71 ("kbuild: add implicit rules for parser generation") - 6aaf49b495b4 ("crypto: arm,arm64 - Fix random regeneration of S_shipped") I rewrote the rule in a more kbuild'ish style. In the normal build, utf8data.h is just shipped from the check-in file. $ make [ snip ] SHIPPED fs/unicode/utf8data.h CC fs/unicode/utf8-norm.o CC fs/unicode/utf8-core.o CC fs/unicode/utf8-selftest.o AR fs/unicode/built-in.a If you want to generate utf8data.h based on UCD, put .txt files into fs/unicode/, then pass REGENERATE_UTF8DATA=1 from the command line. The mkutf8data tool will be automatically compiled to generate the utf8data.h from the *.txt files. $ make REGENERATE_UTF8DATA=1 [ snip ] HOSTCC fs/unicode/mkutf8data GEN fs/unicode/utf8data.h CC fs/unicode/utf8-norm.o CC fs/unicode/utf8-core.o CC fs/unicode/utf8-selftest.o AR fs/unicode/built-in.a I renamed the check-in utf8data.h to utf8data.h_shipped so that this will work for the out-of-tree build. You can update it based on the latest UCD like this: $ make REGENERATE_UTF8DATA=1 fs/unicode/ $ cp fs/unicode/utf8data.h fs/unicode/utf8data.h_shipped Also, I added entries to .gitignore and dontdiff. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 19:08:41 -07:00
Gabriel Krisman Bertazi	b1bda14bfb	unicode: update unicode database unicode version 12.1.0 Regenerate utf8data.h based on the latest UCD files and run tests against the latest version. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 19:07:21 -07:00
Gabriel Krisman Bertazi	2c2236abc9	unicode: introduce test module for normalized utf8 implementation This implements an in-kernel sanity test module for the utf8 normalization core. At probe time, it will run basic sequences through the utf8n core, to identify problems with equivalent sequences and normalization/casefold code. This is supposed to be useful for regression testing when adding support for a new version of utf8 to linux. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 19:07:20 -07:00
Gabriel Krisman Bertazi	c71834b718	unicode: implement higher level API for string handling This patch integrates the utf8n patches with some higher level API to perform UTF-8 string comparison, normalization and casefolding operations. Implemented is a variation of NFD, and casefold is performed by doing full casefold on top of NFD. These algorithms are based on the core implemented by Olaf Weber from SGI. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 19:07:18 -07:00
Olaf Weber	74fc5fb3da	unicode: reduce the size of utf8data[] Remove the Hangul decompositions from the utf8data trie, and do algorithmic decomposition to calculate them on the fly. To store the decomposition the caller of utf8lookup()/utf8nlookup() must provide a 12-byte buffer, which is used to synthesize a leaf with the decomposition. This significantly reduces the size of the utf8data[] array. Changes made by Gabriel: Rebase to mainline Fix checkpatch errors Extract robustness fixes and merge back to original mkutf8data.c patch Regenerate utf8data.h Signed-off-by: Olaf Weber <olaf@sgi.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 19:07:17 -07:00
Olaf Weber	e58b26a26f	unicode: introduce code for UTF-8 normalization Supporting functions for UTF-8 normalization are in utf8norm.c with the header utf8norm.h. Two normalization forms are supported: nfdi and nfdicf. nfdi: - Apply unicode normalization form NFD. - Remove any Default_Ignorable_Code_Point. nfdicf: - Apply unicode normalization form NFD. - Remove any Default_Ignorable_Code_Point. - Apply a full casefold (C + F). For the purposes of the code, a string is valid UTF-8 if: - The values encoded are 0x1..0x10FFFF. - The surrogate codepoints 0xD800..0xDFFF are not encoded. - The shortest possible encoding is used for all values. The supporting functions work on null-terminated strings (utf8 prefix) and on length-limited strings (utf8n prefix). From the original SGI patch and for conformity with coding standards, the utf8data_t typedef was dropped, since it was just masking the struct keyword. On other occasions, namely utf8leaf_t and utf8trie_t, I decided to keep it, since they are simple pointers to memory buffers, and using uchars here wouldn't provide any more meaningful information. From the original submission, we also converted from the compatibility form to canonical. Changes made by Gabriel: Rebase to Mainline Fix up checkpatch.pl warnings Drop typedefs move out of libxfs Convert from NFKD to NFD Signed-off-by: Olaf Weber <olaf@sgi.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 19:07:16 -07:00
Gabriel Krisman Bertazi	5c0d8663cf	unicode: introduce UTF-8 character database The decomposition and casefolding of UTF-8 characters are described in a prefix tree in utf8data.h, which is generated from the Unicode Character Database (UCD), published by the Unicode Consortium, and should not be edited by hand. The structures in utf8data.h are meant to be used for lookup operations by the unicode subsystem, when decoding a utf-8 string. mkutf8data.c is the source for a program that generates utf8data.h. It was written by Olaf Weber from SGI and originally proposed to be merged into Linux in 2014. The original proposal performed the compatibility decomposition, NFKD, but the current version was modified by me to do canonical decomposition, NFD, as suggested by the community. The changes from the original submission are: * Rebase to mainline. * Fix out-of-tree-build. * Update makefile to build 11.0.0 ucd files. * drop references to xfs. * Convert NFKD to NFD. * Merge back robustness fixes from original patch. Requested by Dave Chinner. The original submission is archived at: <https://linux-xfs.oss.sgi.narkive.com/Xx10wjVY/rfc-unicode-utf-8-support-for-xfs> The utf8data.h file can be regenerated using the instructions in fs/unicode/README.utf8data. - Notes on the update from 8.0.0 to 11.0: The structure of the ucd files and special cases have not experienced any changes between versions 8.0.0 and 11.0.0. 8.0.0 saw the addition of Cherokee LC characters, which is an interesting case for case-folding. The update is accompanied by new tests on the test_ucd module to catch specific cases. No changes to mkutf8data script were required for the updates. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 19:07:15 -07:00

23 Commits
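
Commit e58b26a26f lists three conditions for a string to count as valid UTF-8: encoded values lie in 0x1..0x10FFFF, surrogate code points (U+D800..U+DFFF) are not encoded, and every value uses its shortest encoding. The following is a minimal userspace sketch of those checks in Python. It is not the kernel implementation, and the function name `is_valid_utf8` is made up for illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Illustrative check of the three validity rules; not kernel code."""
    i, n = 0, len(data)
    while i < n:
        b0 = data[i]
        # Pick the sequence length, the payload bits of the lead byte, and
        # the smallest code point that legitimately needs that length.
        if b0 < 0x80:
            length, cp, min_cp = 1, b0, 0x1      # min 0x1 rejects NUL
        elif 0xC0 <= b0 <= 0xDF:
            length, cp, min_cp = 2, b0 & 0x1F, 0x80
        elif 0xE0 <= b0 <= 0xEF:
            length, cp, min_cp = 3, b0 & 0x0F, 0x800
        elif 0xF0 <= b0 <= 0xF7:
            length, cp, min_cp = 4, b0 & 0x07, 0x10000
        else:
            return False                         # stray continuation or bad lead byte
        if i + length > n:
            return False                         # truncated sequence
        for b in data[i + 1:i + length]:
            if b & 0xC0 != 0x80:
                return False                     # not a continuation byte
            cp = (cp << 6) | (b & 0x3F)
        if cp < min_cp:
            return False                         # not the shortest encoding
        if cp > 0x10FFFF:
            return False                         # outside the Unicode range
        if 0xD800 <= cp <= 0xDFFF:
            return False                         # surrogate code point
        i += length
    return True
```

For instance, `is_valid_utf8(b"\xc0\xaf")` is rejected because it is an overlong encoding of `/`, and `is_valid_utf8(b"\xed\xa0\x80")` is rejected because it encodes the surrogate U+D800.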
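
Commit 74fc5fb3da replaces the stored Hangul decompositions with an algorithmic one computed on the fly. The sketch below shows the standard arithmetic decomposition of a precomposed Hangul syllable into its jamo, as defined by the Unicode Standard. It is a Python illustration of the idea, not the kernel's code; the function name `decompose_hangul` is hypothetical, and the leaf layout inside the 12-byte buffer is not modeled here.

```python
# Constants from the Unicode Standard's Hangul syllable decomposition.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT      # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT      # 11172 precomposed syllables in total


def decompose_hangul(cp):
    """Return the jamo code points for a Hangul syllable, or [cp] otherwise."""
    s_index = cp - S_BASE
    if not 0 <= s_index < S_COUNT:
        return [cp]                              # not a precomposed syllable
    lead = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    trail = T_BASE + s_index % T_COUNT
    # trail == T_BASE means "no trailing consonant".
    return [lead, vowel] if trail == T_BASE else [lead, vowel, trail]
```

A syllable decomposes into at most three jamo, each of which takes three bytes in UTF-8, so the result is small enough to synthesize into a fixed-size buffer instead of storing all 11172 entries in the trie.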
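
Commit c71834b718 describes string comparison built on a variation of NFD, with casefolding done as a full casefold on top of NFD. The following Python sketch approximates that behaviour with the standard `unicodedata` module. It is only an analogy: it does not remove Default_Ignorable_Code_Point characters as the nfdi/nfdicf forms do, it uses whatever Unicode version ships with the interpreter rather than 12.1.0, and the names `nfd_casefold` and `names_equal` are invented for this example.

```python
import unicodedata


def nfd_casefold(s):
    """NFD, full casefold, then NFD again to decompose folded characters."""
    folded = unicodedata.normalize("NFD", s).casefold()
    return unicodedata.normalize("NFD", folded)


def names_equal(a, b, casefold=False):
    """Compare two names under NFD, optionally ignoring case."""
    if casefold:
        return nfd_casefold(a) == nfd_casefold(b)
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)
```

With this sketch, a precomposed "é" (U+00E9) and "e" followed by U+0301 compare equal under plain normalization, while "File" and "file" compare equal only when `casefold=True`.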