diff --git a/CHANGELOG.md b/CHANGELOG.md index 420701b2f..fbcb6abb0 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -14,11 +14,15 @@ _None_ ### Bug Fixes -_None_ +- `StringsFileValidationHelper.find_duplicated_keys` now parses *unquoted* keys and values (valid `.strings` syntax, common in `InfoPlist.strings`) instead of raising `Invalid character`, matching the character set `plutil` accepts for unquoted strings (alphanumerics plus `_ . - $ : /`). This lets `ios_lint_localizations`' `check_duplicate_keys` work on `InfoPlist.strings`-style files. [#741] +- `StringsFileValidationHelper.find_duplicated_keys` now also accepts comments placed *between* the tokens of a statement (e.g. `"key" /* note */ = "value";`), which `plutil` allows, instead of raising `Invalid character` on the `/`. [#741] +- `ios_lint_localizations`' `check_duplicate_keys` no longer crashes the lane on a file that parses as a property list but isn't a flat `.strings` (e.g. a nested-dictionary value); it now warns via `UI.important` and skips that file. [#741] +- `L10nHelper.merge_strings` now applies the key prefix to unquoted keys containing `. - $ : /` and to lines with an *unquoted* value (e.g. `CFBundleName = WordPress;`), matching the unquoted-string grammar `plutil` accepts. Previously those keys were written to the merged file without the prefix while still being bookkept *with* it, leaving the output inconsistent with the reported keys (which could resurface the very collisions the prefix avoids and break downstream key extraction). [#741] ### Internal Changes -_None_ +- Centralized `.strings` duplicate-key detection behind `StringsFileValidationHelper.scan_for_duplicate_keys`, which detects the file format and returns a `[:scanned | :unsupported_format | :unscannable, payload]` tri-state that callers map to their own policy (warn-and-skip vs fail-closed). `ios_lint_localizations` now uses it. [#741] +- `L10nHelper.strings_file_type` (and `StringsFileValidationHelper.scan_for_duplicate_keys`) accept `assume_valid:` to skip a redundant `plutil -lint` when the caller has already parsed the file. [#741] ## 14.7.0 diff --git a/lib/fastlane/plugin/wpmreleasetoolkit/actions/ios/ios_lint_localizations.rb b/lib/fastlane/plugin/wpmreleasetoolkit/actions/ios/ios_lint_localizations.rb index 42b051d55..155baea5c 100644 --- a/lib/fastlane/plugin/wpmreleasetoolkit/actions/ios/ios_lint_localizations.rb +++ b/lib/fastlane/plugin/wpmreleasetoolkit/actions/ios/ios_lint_localizations.rb @@ -56,15 +56,20 @@ def self.find_duplicated_keys(params) language = File.basename(File.dirname(file), '.lproj') path = File.join(params[:input_dir], file) - file_type = Fastlane::Helper::Ios::L10nHelper.strings_file_type(path: path) - if file_type == :text - duplicates = Fastlane::Helper::Ios::StringsFileValidationHelper.find_duplicated_keys(file: path) - duplicate_keys[language] = duplicates.map { |key, value| "`#{key}` was found at multiple lines: #{value.join(', ')}" } unless duplicates.empty? - else + status, payload = Fastlane::Helper::Ios::StringsFileValidationHelper.scan_for_duplicate_keys(file: path) + case status + when :scanned + duplicate_keys[language] = payload.map { |key, value| "`#{key}` was found at multiple lines: #{value.join(', ')}" } unless payload.empty? + when :unsupported_format UI.important <<~WRONG_FORMAT - File `#{path}` is in #{file_type} format, while finding duplicate keys only make sense on files that are in ASCII-plist format. - Since your files are in #{file_type} format, you should probably disable the `check_duplicate_keys` option from this `#{action_name}` call. + File `#{path}` is in #{payload} format, while finding duplicate keys only make sense on files that are in ASCII-plist format. + Since your files are in #{payload} format, you should probably disable the `check_duplicate_keys` option from this `#{action_name}` call. WRONG_FORMAT + when :unscannable + UI.important <<~UNSCANNABLE + Could not check `#{path}` for duplicate keys: #{payload.strip} + The file parses as a property list but isn't a flat `.strings` file the duplicate-key scanner understands, so duplicate detection was skipped for it. + UNSCANNABLE end end diff --git a/lib/fastlane/plugin/wpmreleasetoolkit/helper/ios/ios_l10n_helper.rb b/lib/fastlane/plugin/wpmreleasetoolkit/helper/ios/ios_l10n_helper.rb index 26ed9ef66..59f8daa7d 100644 --- a/lib/fastlane/plugin/wpmreleasetoolkit/helper/ios/ios_l10n_helper.rb +++ b/lib/fastlane/plugin/wpmreleasetoolkit/helper/ios/ios_l10n_helper.rb @@ -15,18 +15,24 @@ class L10nHelper # Returns the type of a `.strings` file (XML, binary or ASCII) # # @param [String] path The path to the `.strings` file to check + # @param [Boolean] assume_valid Skip the `plutil -lint` validity check when the caller has already + # confirmed the file parses (e.g. via `read_strings_file_as_hash`), avoiding a redundant + # `plutil` invocation. Only the format detection (`file`) then runs. # @return [Symbol] The file format used by the `.strings` file. Can be one of: # - `:text` for the ASCII-plist file format (containing typical `"key" = "value";` lines) # - `:xml` for XML plist file format (can be used if machine-generated, especially since there's no official way/tool to generate the ASCII-plist file format as output) # - `:binary` for binary plist file format (usually only true for `.strings` files converted by Xcode at compile time and included in the final `.app`/`.ipa`) # - `nil` if the file does not exist or is neither of those format (e.g. not a `.strings` file at all) # - def self.strings_file_type(path:) + def self.strings_file_type(path:, assume_valid: false) return :text if File.empty?(path) # If completely empty file, consider it as a valid `.strings` files in textual format - # Start by checking it seems like a valid property-list file (and not e.g. an image or plain text file) - _, status = Open3.capture2('/usr/bin/plutil', '-lint', path) - return nil unless status.success? + # Start by checking it seems like a valid property-list file (and not e.g. an image or plain text file). + # A caller that has already parsed the file can skip this redundant check via `assume_valid: true`. + unless assume_valid + _, status = Open3.capture2('/usr/bin/plutil', '-lint', path) + return nil unless status.success? + end # If it is a valid property-list file, determine the actual format used format_desc, status = Open3.capture2('/usr/bin/file', path) @@ -96,9 +102,17 @@ def self.merge_strings(paths:, output_path:) # Read line-by-line to reduce memory footprint during content copy read_utf8_lines(input_file).each do |line| unless prefix.nil? || prefix.empty? - # The `/u` modifier on the RegExps is to make them UTF-8 + # The `/u` modifier on the RegExps is to make them UTF-8. + # The unquoted-key character set below matches what `plutil` accepts (alphanumerics plus + # `_ . - $ : /`, as in `StringsFileValidationHelper::UNQUOTED_STRING_CHARACTER`) so that keys + # containing `. - $ : /` (e.g. reverse-DNS keys) are prefixed like any other. line.gsub!(/^(\s*")/u, "\\1#{prefix}") # Lines starting with a quote are considered to be start of a key; add prefix right after the quote - line.gsub!(/^(\s*)([A-Z0-9_]+)(\s*=\s*")/ui, "\\1\"#{prefix}\\2\"\\3") # Lines starting with an identifier followed by a '=' are considered to be an unquoted key (typical in InfoPlist.strings files for example) + # Lines starting with an unquoted key followed by a `=` and a *quoted* value (typical in `InfoPlist.strings`). + line.gsub!(%r{^(\s*)([a-zA-Z0-9_.$:/-]+)(\s*=\s*")}u, "\\1\"#{prefix}\\2\"\\3") + # Lines starting with an unquoted key followed by a `=` and an *unquoted* value (e.g. `CFBundleName = WordPress;`). + # The value is required to be a single unquoted token ending in `;` — an unquoted string can't contain spaces, so + # this won't mistake comment prose like `and = a red herring for when…` (spaces in the "value") for a key/value pair. + line.gsub!(%r{^(\s*)([a-zA-Z0-9_.$:/-]+)(\s*=\s*)([a-zA-Z0-9_.$:/-]+\s*;)}u, "\\1\"#{prefix}\\2\"\\3\\4") end tmp_file.write(line) end diff --git a/lib/fastlane/plugin/wpmreleasetoolkit/helper/ios/ios_strings_file_validation_helper.rb b/lib/fastlane/plugin/wpmreleasetoolkit/helper/ios/ios_strings_file_validation_helper.rb index 890b8138a..c16f4818c 100644 --- a/lib/fastlane/plugin/wpmreleasetoolkit/helper/ios/ios_strings_file_validation_helper.rb +++ b/lib/fastlane/plugin/wpmreleasetoolkit/helper/ios/ios_strings_file_validation_helper.rb @@ -5,24 +5,75 @@ module Helper module Ios class StringsFileValidationHelper # context can be one of: - # :root, :maybe_comment_start, :in_line_comment, :in_block_comment, - # :maybe_block_comment_end, :in_quoted_key, - # :after_quoted_key_before_eq, :after_quoted_key_and_equal, - # :in_quoted_value, :after_quoted_value - State = Struct.new(:context, :buffer, :in_escaped_ctx, :found_key) + # :root, :maybe_comment_start, :maybe_comment_or_value, :in_line_comment, :in_block_comment, + # :maybe_block_comment_end, :in_quoted_key, :in_unquoted_key, + # :after_quoted_key_before_eq, :after_quoted_key_and_eq, + # :in_quoted_value, :in_unquoted_value, :after_quoted_value + # + # `resume_context` holds the context to return to once a comment ends. Comments are valid not only at + # the top level but also *between* the tokens of a statement (e.g. `"key" /* note */ = "value";`), so a + # comment must resume the state it interrupted rather than always dropping back to `:root`. + State = Struct.new(:context, :buffer, :in_escaped_ctx, :found_key, :resume_context) + + # Characters allowed in an *unquoted* string — a key or a value. Unquoted strings are valid + # `.strings` syntax (the old-style ASCII property-list format) and are common in `InfoPlist.strings` + # (e.g. `CFBundleName = WordPress;`). `plutil` accepts alphanumerics plus `_ . - $ : /` in an + # unquoted string, so we match the same set: this scanner only ever runs on input `plutil` has + # already accepted, and matching its grammar keeps a file it parses from tripping the scanner. + # An unquoted key runs until the first whitespace or `=`; an unquoted value until whitespace or `;`. + UNQUOTED_STRING_CHARACTER = %r{[a-zA-Z0-9_.$:/-]}u + + # Enter a comment from an inter-token position, remembering where to resume once it ends. + # (`state.context` is still the originating state when a transition lambda runs — it's only reassigned + # to the lambda's return value afterwards — so this captures the state the comment interrupts.) + ENTER_COMMENT = lambda do |state, _c| + state.resume_context = state.context + :maybe_comment_start + end + + # A `/` between `=` and the value is ambiguous: it can start a comment (`/* … */` or `// …`) or be the + # first character of an unquoted value (e.g. a path like `/usr/bin`). Defer the decision by one character + # via `:maybe_comment_or_value`. + ENTER_COMMENT_OR_VALUE = lambda do |state, _c| + state.resume_context = state.context + :maybe_comment_or_value + end + + # Restore the context a comment interrupted (defaults to `:root`), then clear the saved context. + RESUME_AFTER_COMMENT = lambda do |state, _c| + resume = state.resume_context || :root + state.resume_context = :root + resume + end TRANSITIONS = { root: { /\s/u => :root, '/' => :maybe_comment_start, - '"' => :in_quoted_key + '"' => :in_quoted_key, + # An unquoted key, e.g. `CFBundleName = "…";` as used by `InfoPlist.strings`. + UNQUOTED_STRING_CHARACTER => lambda do |state, c| + state.buffer.write(c) + :in_unquoted_key + end }, maybe_comment_start: { '/' => :in_line_comment, /\*/u => :in_block_comment }, + # Reached only from `:after_quoted_key_and_eq`, where a leading `/` might begin a comment or an + # unquoted value. A `*` or `/` confirms a comment; anything else means the `/` was the value's first + # character (values aren't buffered, so we just continue scanning the value). + maybe_comment_or_value: { + /\*/u => :in_block_comment, + '/' => :in_line_comment, + /./mu => lambda do |state, _c| + state.resume_context = :root + :in_unquoted_value + end + }, in_line_comment: { - "\n" => :root, + "\n" => RESUME_AFTER_COMMENT, /./u => :in_line_comment }, in_block_comment: { @@ -30,7 +81,7 @@ class StringsFileValidationHelper /./mu => :in_block_comment }, maybe_block_comment_end: { - '/' => :root, + '/' => RESUME_AFTER_COMMENT, /./mu => :in_block_comment }, in_quoted_key: { @@ -44,19 +95,48 @@ class StringsFileValidationHelper :in_quoted_key end }, + in_unquoted_key: { + # The key ends at the first whitespace or `=`. Whitespace still expects an `=` next; + # an `=` moves straight on to the value. + /[\s=]/u => lambda do |state, c| + state.found_key = state.buffer.string.dup + state.buffer = StringIO.new + c == '=' ? :after_quoted_key_and_eq : :after_quoted_key_before_eq + end, + UNQUOTED_STRING_CHARACTER => lambda do |state, c| + state.buffer.write(c) + :in_unquoted_key + end + }, after_quoted_key_before_eq: { + # A comment may sit between the key and the `=` (e.g. `"key" /* note */ = "value";`). + '/' => ENTER_COMMENT, /\s/u => :after_quoted_key_before_eq, '=' => :after_quoted_key_and_eq }, after_quoted_key_and_eq: { + # A `/` here may start a comment or an unquoted value (which can contain `/`); disambiguate one char + # later. This entry must precede `UNQUOTED_STRING_CHARACTER` below, which also matches `/`. + '/' => ENTER_COMMENT_OR_VALUE, /\s/u => :after_quoted_key_and_eq, - '"' => :in_quoted_value + '"' => :in_quoted_value, + # An unquoted value, e.g. `CFBundleName = WordPress;` as used by `InfoPlist.strings`. + UNQUOTED_STRING_CHARACTER => :in_unquoted_value }, in_quoted_value: { '"' => :after_quoted_value, /./mu => :in_quoted_value }, + in_unquoted_value: { + # The value ends at the first whitespace or the terminating `;`. Its contents are irrelevant + # to duplicate-key detection, so — unlike a key — we don't buffer it. + ';' => :root, + /\s/u => :after_quoted_value, + UNQUOTED_STRING_CHARACTER => :in_unquoted_value + }, after_quoted_value: { + # A comment may sit between the value and the terminating `;` (e.g. `"key" = "value" /* note */;`). + '/' => ENTER_COMMENT, /\s/u => :after_quoted_value, ';' => :root } @@ -70,7 +150,7 @@ class StringsFileValidationHelper def self.find_duplicated_keys(file:) keys_with_lines = Hash.new { |h, k| h[k] = [] } - state = State.new(context: :root, buffer: StringIO.new, in_escaped_ctx: false, found_key: nil) + state = State.new(context: :root, buffer: StringIO.new, in_escaped_ctx: false, found_key: nil, resume_context: :root) # Using our `each_utf8_line` helper instead of `File.readlines` ensures we can also read files that are # encoded in UTF-16, yet process each of their lines as a UTF-8 string, so that `RegExp#match?` don't throw @@ -106,6 +186,28 @@ def self.find_duplicated_keys(file:) keys_with_lines.keep_if { |_, lines| lines.count > 1 } end + + # Detects the file format and, when applicable, scans for duplicate keys — in one step, so + # callers don't each re-implement the "`:text`-only" gate. `find_duplicated_keys` only + # understands the flat ASCII-plist syntax; an xml/binary plist can't be tokenized by it (though + # `plutil` collapses any duplicate to its last value when parsing those anyway). + # + # @param [String] file The path to the `.strings` file to inspect. + # @param [Boolean] assume_valid Forwarded to `strings_file_type`: skip the redundant `plutil -lint` + # when the caller has already confirmed the file parses. + # @return [Array] A `[status, payload]` pair, one of: + # - `[:scanned, { key => [lines] }]` — a `:text` file we tokenized (hash empty if none). + # - `[:unsupported_format, format]` — not a `:text` file (`:xml`, `:binary`, or `nil`); not scanned. + # - `[:unscannable, error_message]` — a `:text` file the tokenizer couldn't read. + # Each caller decides how to react (warn-and-skip, fail closed, …) from this one source of truth. + def self.scan_for_duplicate_keys(file:, assume_valid: false) + format = Fastlane::Helper::Ios::L10nHelper.strings_file_type(path: file, assume_valid: assume_valid) + return [:unsupported_format, format] unless format == :text + + [:scanned, find_duplicated_keys(file: file)] + rescue StandardError => e + [:unscannable, e.message] + end end end end diff --git a/spec/ios_l10n_helper_spec.rb b/spec/ios_l10n_helper_spec.rb index e9609240b..3d2d66ed4 100644 --- a/spec/ios_l10n_helper_spec.rb +++ b/spec/ios_l10n_helper_spec.rb @@ -39,6 +39,14 @@ def file_encoding(path) expect(File).to exist(invalid_fixture) expect(described_class.strings_file_type(path: invalid_fixture)).to be_nil end + + it 'skips the redundant `plutil -lint` check when `assume_valid: true`, still detecting the format' do + text_fixture = fixture('Localizable-utf16.strings') + allow(Open3).to receive(:capture2).and_call_original + expect(Open3).not_to receive(:capture2).with('/usr/bin/plutil', '-lint', anything) + + expect(described_class.strings_file_type(path: text_fixture, assume_valid: true)).to eq(:text) + end end describe '#merge_strings' do @@ -92,6 +100,28 @@ def file_encoding(path) end end + it 'prefixes unquoted keys with unquoted values and keys containing `. - $ : /`' do + # Regression: prefixing must cover the full unquoted-string grammar `plutil` accepts — unquoted *values* + # (e.g. `CFBundleName = WordPress;`) and keys containing `. - $ : /` — so the written file stays consistent + # with the prefixed keys we report. (Previously only `[A-Z0-9_]` keys with a *quoted* value were prefixed, + # leaving these written without the prefix and resurfacing the very collisions the prefix avoids.) + content = <<~STRINGS + CFBundleName = WordPress; + com.automattic.app-id = "X"; + "QuotedKey" = "Y"; + STRINGS + Dir.mktmpdir('a8c-release-toolkit-l10n-helper-tests-') do |tmp_dir| + input_file = File.join(tmp_dir, 'InfoPlist.strings') + File.write(input_file, content) + output_file = File.join(tmp_dir, 'output.strings') + described_class.merge_strings(paths: { input_file => 'pfx.' }, output_path: output_file) + merged = File.read(output_file) + expect(merged).to include('"pfx.CFBundleName" = WordPress;') + expect(merged).to include('"pfx.com.automattic.app-id" = "X";') + expect(merged).to include('"pfx.QuotedKey" = "Y";') + end + end + it 'returns duplicate keys found' do paths = { fixture('Localizable-utf16.strings') => nil, fixture('non-latin-utf16.strings') => nil } Dir.mktmpdir('a8c-release-toolkit-l10n-helper-tests-') do |tmp_dir| diff --git a/spec/ios_lint_localizations_spec.rb b/spec/ios_lint_localizations_spec.rb index af357ab3a..737eaa6a5 100644 --- a/spec/ios_lint_localizations_spec.rb +++ b/spec/ios_lint_localizations_spec.rb @@ -228,4 +228,54 @@ def run_l10n_linter_test(data_file:, check_duplicate_keys: nil, fail_on_strings_ ) end end + + # Regression coverage for the unquoted-key/value parsing in `StringsFileValidationHelper`, which used + # to raise `Invalid character` on *unquoted* keys (valid `.strings` syntax, common in `InfoPlist.strings`) + # and so crashed `check_duplicate_keys`. It now parses unquoted keys and detects duplicates among them. + context 'duplicate-key detection on files with unquoted keys' do + let(:input_dir) { Dir.mktmpdir('a8c-lint-l10n-unquoted-') } + + after { FileUtils.remove_entry(input_dir) } + + def write_localizable(lang, content) + lproj = File.join(input_dir, "#{lang}.lproj") + FileUtils.mkdir_p(lproj) + File.write(File.join(lproj, 'Localizable.strings'), content) + end + + it 'parses unquoted keys instead of raising, and detects duplicates among them' do + write_localizable('en', <<~STRINGS) + okay_key = "value"; + NSCameraUsageDescription = "Take photos"; + NSCameraUsageDescription = "Take a photo"; + "quoted_key" = "value"; + STRINGS + + result = nil + expect { result = described_class.find_duplicated_keys({ input_dir: input_dir }) }.not_to raise_error + expect(result).to eq('en' => ['`NSCameraUsageDescription` was found at multiple lines: 2, 3']) + end + + it 'reports no duplicates when unquoted keys are unique (mixed with quoted keys)' do + write_localizable('en', <<~STRINGS) + CFBundleName = "WordPress"; + NSCameraUsageDescription = "Take photos"; + "quoted_key" = "value"; + STRINGS + + expect(described_class.find_duplicated_keys({ input_dir: input_dir })).to eq({}) + end + + it 'warns and skips (without crashing) a file that parses as a plist but is not a flat `.strings`' do + # A nested-dictionary value is a valid old-style plist that `plutil` accepts but the duplicate-key + # scanner can't tokenize. This used to crash the lane (the scanner raised, uncaught); now it is + # surfaced via `UI.important` and the file is skipped. + write_localizable('en', "\"k\" = { a = b; };\n") + expect(FastlaneCore::UI).to receive(:important).with(/Could not check .* for duplicate keys/) + + result = nil + expect { result = described_class.find_duplicated_keys({ input_dir: input_dir }) }.not_to raise_error + expect(result).to eq({}) + end + end end diff --git a/spec/ios_strings_file_validation_helper_spec.rb b/spec/ios_strings_file_validation_helper_spec.rb index a183f682a..c7c2f646c 100644 --- a/spec/ios_strings_file_validation_helper_spec.rb +++ b/spec/ios_strings_file_validation_helper_spec.rb @@ -54,14 +54,117 @@ end end - context 'when there are unquoted keys' do - it 'returns an error' do - # Unquoted strings are currently not supported by our validation helper in its current state, despite being a valid syntax, because we considered - # that it was not worth adding complexity to our state machine logic for this use case — we expect all the `.strings` files we plan to validate will - # come from GlotPress exports, and will thus always have their keys quoted. - # If support for unquoted strings is added to our validation helper in the future, feel free to update this test example accordingly. - expect { described_class.find_duplicated_keys(file: File.join(test_data_dir, 'ios_l10n_helper', 'expected-merged.strings')) } - .to raise_error(RuntimeError, 'Invalid character `I` found on line 21, col 1') + context 'when there are unquoted keys and values' do + # Unquoted strings — keys and values — are valid `.strings` syntax (the old-style ASCII plist format) + # and are common in `InfoPlist.strings` (e.g. `CFBundleDisplayName = WordPress;`). + it 'parses them alongside quoted keys without raising, finding no duplicates among unique keys' do + # `expected-merged.strings` mixes quoted (`key1`–`key3`) and unquoted (`InfoKey1`–`InfoKey3`) keys. + expect(described_class.find_duplicated_keys(file: File.join(test_data_dir, 'ios_l10n_helper', 'expected-merged.strings'))).to be_empty + end + + it 'detects duplicates among unquoted keys, reporting each occurrence line' do + content = <<~STRINGS + CFBundleName = "WordPress"; + NSCameraUsageDescription = "Take photos"; + CFBundleName = "Jetpack"; + STRINGS + with_tmp_file(named: 'InfoPlist.strings', content: content) do |path| + expect(described_class.find_duplicated_keys(file: path)).to eq('CFBundleName' => [1, 3]) + end + end + + it 'parses unquoted *values* (not just keys) without raising, and finds duplicates among them' do + # `CFBundleName = WordPress;` (both key and value unquoted) is valid ASCII-plist that `plutil` + # parses; the scanner must tokenize it rather than choking on the unquoted value. + content = <<~STRINGS + CFBundleName = WordPress; + CFBundleShortVersionString = 1.0; + CFBundleName = Jetpack; + STRINGS + with_tmp_file(named: 'InfoPlist.strings', content: content) do |path| + expect(described_class.find_duplicated_keys(file: path)).to eq('CFBundleName' => [1, 3]) + end + end + + it 'parses unquoted keys containing `.`, `-`, `_`, `$`, `:`, and `/` (the chars `plutil` allows)' do + content = <<~STRINGS + com.example.app-name_2 = "v"; + a$b:c/d = "v"; + a$b:c/d = "w"; + STRINGS + with_tmp_file(named: 'InfoPlist.strings', content: content) do |path| + expect(described_class.find_duplicated_keys(file: path)).to eq('a$b:c/d' => [2, 3]) + end + end + end + + context 'when comments appear between the tokens of a statement' do + # Comments are valid `.strings` syntax not only on their own line but also *between* the tokens of a + # statement — after a key, around the `=`, or before the terminating `;`. `plutil` accepts all of these, + # so the scanner must tokenize them rather than raising `Invalid character` on the `/`. + it 'parses comments after a key, around the `=`, and before the `;` without raising' do + content = <<~STRINGS + "afterKey" /* note */ = "1"; + "aroundEq" = /* note */ "2"; + "beforeSemicolon" = "3" /* note */; + unquotedKey /* note */ = unquotedValue; + STRINGS + with_tmp_file(named: 'Localizable.strings', content: content) do |path| + expect(described_class.find_duplicated_keys(file: path)).to be_empty + end + end + + it 'still detects duplicate keys in a file that also contains inline comments' do + content = <<~STRINGS + "dup" /* first */ = "1"; + "unique" = "x"; + "dup" = "2" /* second */; + STRINGS + with_tmp_file(named: 'Localizable.strings', content: content) do |path| + expect(described_class.find_duplicated_keys(file: path)).to eq('dup' => [1, 3]) + end + end + + it 'does not mistake a `/`-leading unquoted value for the start of a comment' do + # A `/` right after `=` may begin a comment OR an unquoted value (e.g. a path or URL); the latter, + # which `plutil` accepts, must still parse rather than be swallowed as a comment. + content = <<~STRINGS + "path" = /usr/bin/tool; + "url" = https://example.com/x; + "path" = /opt; + STRINGS + with_tmp_file(named: 'Localizable.strings', content: content) do |path| + expect(described_class.find_duplicated_keys(file: path)).to eq('path' => [1, 3]) + end + end + end + + describe '.scan_for_duplicate_keys' do + it 'returns `[:scanned, duplicates]` for a `:text` file, with the duplicate hash (empty if none)' do + with_tmp_file(named: 'dups.strings', content: "\"k\" = \"a\";\n\"k\" = \"b\";\n") do |path| + expect(described_class.scan_for_duplicate_keys(file: path)).to eq([:scanned, { 'k' => [1, 2] }]) + end + with_tmp_file(named: 'unique.strings', content: "\"k\" = \"a\";\n\"j\" = \"b\";\n") do |path| + expect(described_class.scan_for_duplicate_keys(file: path)).to eq([:scanned, {}]) + end + end + + it 'returns `[:unsupported_format, format]` for a non-`:text` (XML) plist, without scanning' do + in_tmp_dir do |dir| + path = File.join(dir, 'x.strings') + Fastlane::Helper::Ios::L10nHelper.generate_strings_file_from_hash(translations: { 'a' => 'b' }, output_path: path) + status, format = described_class.scan_for_duplicate_keys(file: path) + expect(status).to eq(:unsupported_format) + expect(format).to eq(:xml) + end + end + + it 'returns `[:unscannable, message]` for a `:text` plist the tokenizer cannot read' do + with_tmp_file(named: 'nested.strings', content: "\"k\" = { a = b; };\n") do |path| + status, message = described_class.scan_for_duplicate_keys(file: path) + expect(status).to eq(:unscannable) + expect(message).to match(/Invalid character/) + end end end end