Compare similarity or edit distance between each pair of lines in a file?

Answer 1

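I'm not very familiar with the Levenshtein distance, but Perl has a module for computing edit distances, so I wrote a simple Perl script that calculates the distance for every pair of lines in the input and then prints the pairs in order of increasing distance, subject to a "top X" (-n) parameter: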

#!/usr/bin/perl
use strict;
use warnings;
use Text::Levenshtein qw(distance);
use Getopt::Std;

our $opt_n;
getopts('n:');
$opt_n ||= -1; # print all the matches if -n is not provided

# read all input lines, dropping the trailing newlines so they do not
# take part in the distance calculation
chomp(my @lines = <>);
my %distances;

# for each combination of two lines, compute the distance and group
# the pairs by that distance
for my $i (0 .. $#lines - 1) {
  for my $j ($i + 1 .. $#lines) {
    my $d = distance($lines[$i], $lines[$j]);
    push @{ $distances{$d} }, "$lines[$i]\n$lines[$j]\n";
  }
}

# print in order of increasing distance, stopping after -n groups
foreach my $d (sort { $a <=> $b } keys %distances) {
  print "At distance $d:\n" . join("\n", @{ $distances{$d} }) . "\n";
  last unless --$opt_n;
}

Running the script on sample input gives:

$ ./solve.pl < input
At distance 1:
Who was the 8th president?
Who was the 9th president?

At distance 3:
What is your favorite color?
What is your favorite food?

At distance 21:
What is your favorite color?
Who was the 8th president?
What is your favorite color?
Who was the 9th president?
What is your favorite food?
Who was the 8th president?
What is your favorite food?
Who was the 9th president?
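
For readers who, like me, are not familiar with the metric behind these numbers: the Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other. The following is only a minimal sketch of the usual dynamic-programming approach, not the code inside Text::Levenshtein; the `levenshtein` function name is hypothetical, and it compares whatever units `split //` yields, so it assumes plain ASCII or already-decoded text.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper (not part of Text::Levenshtein): edit distance
# by dynamic programming, keeping only the previous row of the table.
sub levenshtein {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;

    # Row 0: turning the empty string into each prefix of $t
    # takes one insertion per character.
    my @prev = (0 .. scalar @t);

    for my $i (1 .. scalar @s) {
        # Column 0: turning the first $i characters of $s into the
        # empty string takes $i deletions.
        my @cur = ($i);
        for my $j (1 .. scalar @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            push @cur, min(
                $prev[$j] + 1,            # deletion
                $cur[$j - 1] + 1,         # insertion
                $prev[$j - 1] + $cost,    # substitution or match
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# The two lines differ only in "8" versus "9": one substitution.
print levenshtein("Who was the 8th president?",
                  "Who was the 9th president?"), "\n";
```

The sketch keeps a single previous row instead of the full table, so memory grows with the length of one line rather than with the product of both lengths.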

And to demonstrate the optional parameter:

$ ./solve.pl -n 2 < input
At distance 1:
Who was the 8th president?
Who was the 9th president?

At distance 3:
What is your favorite color?
What is your favorite food?

I wasn't sure exactly how the output should be printed, but the strings can be printed however you like.
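
As one possible alternative layout, each pair could go on a single tab-separated line, which is easier to pipe into tools such as sort or cut. This is only a sketch: `format_pair` is a hypothetical helper, not part of the script, and it assumes the lines themselves contain no tab characters.

```perl
use strict;
use warnings;

# Hypothetical formatter: one record per pair, in the form
# "distance<TAB>first line<TAB>second line".
sub format_pair {
    my ($distance, $first, $second) = @_;
    chomp($first, $second);    # tolerate lines that still end in "\n"
    return join("\t", $distance, $first, $second) . "\n";
}

print format_pair(1, "Who was the 8th president?\n",
                     "Who was the 9th president?\n");
```

In the script, the `push` inside the inner loop could store the result of such a formatter instead of the two concatenated lines.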
