Edited 2017/01/30

Answer 1

I would use perl, with something like:

perl -MFile::Find -MClone=clone -lne '
  # parse the strings.txt input, here looking for the sequences of
  # 0 or more characters (.*?) in between two " characters
  for (/"(.*?)"/g) {
    # @needle is an array of associative arrays whose keys
    # are the "strings" for each line.
    $needle[$n]{$_} = undef;
  }
  $n++;

  END{
    sub wanted {
      return unless -f; # only regular files
      my $needle_clone = clone(\@needle);
      if (open FILE, "<", $_) {
        LINE: while (<FILE>) {
          # read the file line by line
          for (my $i = 0; $i < $n; $i++) {
            for my $s (keys %{$needle_clone->[$i]}) {
              if (index($_, $s)>=0) {
                # if the string is found, we delete it from the associative
                # array.
                delete $needle_clone->[$i]{$s};
                unless (%{$needle_clone->[$i]}) {
                  # if the associative array is empty, that means we have
                  # found all the strings for that $i, that means we can
                  # stop processing, and the file matches
                  print $File::Find::name;
                  last LINE;
                }
              }
            }
          }
        }
        close FILE;
      }
    }
    find(\&wanted, ".")
  }' /path/to/strings.txt

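This keeps the number of string searches to a minimum: once a string has been found in a file, it is deleted from that file's copy of the list and is not searched for again.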

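Here the files are processed line by line. If the files are fairly small, you could process each one as a whole, which would simplify the code a bit and might improve performance.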
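
The whole-file variant mentioned above could look roughly like the following. This is only a sketch, assuming Perl 5.10 or later and files small enough to hold in memory; `content_matches` and `file_matches` are hypothetical helper names that do not appear in the one-liner, and the groups are passed as plain arrays of strings rather than as the `@needle` hashes.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $content contains every string of at least
# one group. $groups is a reference to a list of array refs of strings.
sub content_matches {
    my ($content, $groups) = @_;
    GROUP: for my $group (@$groups) {
        next unless @$group;                     # an empty group never matches
        for my $s (@$group) {
            next GROUP if index($content, $s) < 0;   # one string missing: try next group
        }
        return 1;                                # every string of this group was found
    }
    return 0;
}

# Hypothetical helper: read a file in one go and test its content.
sub file_matches {
    my ($path, $groups) = @_;
    open my $fh, '<', $path or return 0;         # unreadable files are skipped
    local $/;                                    # no record separator: read the whole file
    my $content = <$fh>;
    close $fh;
    return content_matches($content // '', $groups);
}
```

Inside `wanted`, the line-by-line loop would then reduce to something like `print $File::Find::name if file_matches($_, \@groups);`, with no need to clone the list for each file. The trade-off is memory use on large files.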

The list file is expected to be in the following format:

 "surveillance data" "surveillance technology" "cctv camera"
 "social media" "surveillance techniques" "enforcement agencies"
 "social control" "surveillance camera" "social security"
 "surveillance data" "security guards" "social networking"
 "surveillance mechanisms" "cctv surveillance" "contemporary surveillance"

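That is, each line holds some number (not necessarily 3) of strings enclosed in double quotes. A quoted string cannot itself contain a double-quote character, and the double quotes are not part of the text being searched for. So, if the list file contains: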

"A" "B"
"1" "2" "3"

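this reports the path of every regular file in the current directory and below that contains either

both A and B
or (inclusively, not as an exclusive or) all of 1, 2 and 3

anywhere in it.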
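
To make the list-file format concrete, here is a small sketch of how the lines are split into groups of strings. It uses the same `"(.*?)"` pattern as the one-liner; `parse_groups` is a hypothetical helper name, and the groups are stored as arrays here instead of the hashes used in the one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the lines of a list file into groups of strings.
sub parse_groups {
    my @lines = @_;
    my @groups;
    for my $line (@lines) {
        # In list context, /g returns every captured text between double quotes.
        my @strings = $line =~ /"(.*?)"/g;
        push @groups, \@strings;
    }
    return \@groups;
}

my $groups = parse_groups('"A" "B"', '"1" "2" "3"');
# $groups is now [ ["A", "B"], ["1", "2", "3"] ]; the quotes are stripped.
for my $group (@$groups) {
    print join(", ", @$group), "\n";
}
```

Each line of the list file becomes one group, and a file is reported as soon as all the strings of any one group have been seen in it.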

Edited 2017/01/29