A complex diff method

Question

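This data lends itself to a multi-level associative array (a Hash-of-Hashes, or HoH, in Perl terminology; see perldsc, the Perl Data Structures Cookbook). The first-level key is the node name, and the second-level keys (called "sub-keys" in the script below) are the relevant field names (Availability, State, Reason, etc.).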
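As a rough picture of that structure, here is a minimal sketch of what one node's entry might look like once it is loaded. The node address and the State and Reason values are taken from the sample output further down; the Availability value is only an assumption for illustration.

```perl
use strict;
use warnings;

# One node's record in a hash-of-hashes.
# The 'Availability' value is a made-up placeholder, not real data.
my %old = (
    '10.72.12.148' => {
        name           => '10.72.12.148',
        'Availability' => 'unknown',
        'State'        => 'enabled',
        'Reason'       => 'Node address does not have service checking enabled',
    },
);

# First-level key: node name. Second-level key: field name.
print $old{'10.72.12.148'}{State}, "\n";    # prints "enabled"

# Walk every field stored for every node.
for my $node (sort keys %old) {
    for my $field (sort keys %{ $old{$node} }) {
        print "$node / $field = $old{$node}{$field}\n";
    }
}
```

Because each node maps to its own inner hash, comparing two files comes down to looking up the same node and field in two such structures.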

For example:

#!/usr/bin/perl

use strict;

die "Usage $0 [oldfile] [newfile]\n" unless (@ARGV == 2) ;

# remember both filename args
my ($oldfile,$newfile) = @ARGV[0,1];

die "$oldfile is not readable or does not exist\n" unless -r $oldfile;
die "$newfile is not readable or does not exist\n" unless -r $newfile;

# Hash variables to hold old and new data
my (%old, %new);

# Hash reference variable pointing to the hash we want
# the main loop to populate at any given moment.
# Starts off pointing to %old, changes to %new after the
# first file reaches end-of-file.
# See https://perldoc.perl.org/perlreftut and
# https://perldoc.perl.org/perlref
my $hashref = \%old;

# variable to hold the name of the current node name as
# the records in the input files are read in.
my $node;

# read and parse input files
while(<>) {
  chomp;
  s/^\s*|\s*$//g; # strip leading and trailing whitespace
  s/\s+:\s+/ : /; # strip excess whitespace around first :

  if (/^Ltm::Node:.*\s+\((.*)\)/) {
    $node = $1;
    $hashref->{$node}{name} = $node;
  } elsif (/ : /) {
    my ($key, $val) = split / : /,$_, 2;
    $hashref->{$node}{$key} = $val
  } else {
    print STDERR "Unknown data '$_' on line $. of $ARGV\n";
  };

  if (eof) {
    close(ARGV);       # reset line counter
    $hashref = \%new;  # start populating %new instead of %old
  }
};

# compare the keys from both files
my @common_keys = ();
foreach my $k (keys %old) {
  if (exists($new{$k})) {
    push @common_keys, $k;
  } else {
    print "Node $k found in $oldfile but not in $newfile\n"
  };
};

foreach my $k (keys %new) {
  if (! exists($old{$k})) {
    print "Node $k found in $newfile but not in $oldfile\n";
  };
}

# The list of sub-keys we care about.
my @subkeys = ('Availability', 'State', 'Reason', 'Monitor',
               'Monitor Status');

# now compare sub-keys in each of the nodes
foreach my $k (@common_keys) {
  foreach my $sk (@subkeys) {
    if ($old{$k}{$sk} ne $new{$k}{$sk}) {
      printf "[%-15s %-14s] Old = \"%s\", new = \"%s\"\n", $k, $sk,
        $old{$k}{$sk}, $new{$k}{$sk};
    }
  }
}

Save it as, e.g., compare.pl, make it executable with chmod +x compare.pl, and run it like so:

$ ./compare.pl old.txt new.txt  
Node 10.72.12.150 found in old.txt but not in new.txt
Node 10.72.12.149 found in new.txt but not in old.txt
[10.72.12.148    State         ] Old = "enabled", new = "xenabled"
[10.72.7.122     Reason        ] Old = "Node address does not have service checking enabled", new = "xNode address does not have service checking enabled"

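Note: the data in both of your input files was identical apart from minor differences in the Ltm::Node lines, so I had to edit new.txt and add an x in front of some fields to create some differences. I also added node 10.72.12.150 to old.txt and node 10.72.12.149 to new.txt.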

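It's worth noting that because Perl hashes are inherently unordered, the node differences will be printed in a different order from run to run. It would be easy enough to get a consistent order by sorting the keys of %old when populating the @common_keys array, but you'd need to implement a natural sort / version sort subroutine (or use one of the natural sort modules that exist on CPAN) so that IP addresses sort correctly. I'll leave that enhancement as an exercise for the reader; it isn't needed for this example.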
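One possible starting point for that exercise, sketched under the assumption that every node name is a plain dotted-quad IPv4 address. This is not a general natural sort and does not use a CPAN module: it packs each address into four bytes and compares those byte strings, which gives numeric ordering per octet. The helper names ip_key and sort_ips are invented for this illustration and are not part of the script above.

```perl
use strict;
use warnings;

# Convert "a.b.c.d" into a 4-byte string. Comparing these strings
# byte by byte matches the numeric order of the addresses.
# Assumes a well-formed IPv4 address; anything else is not handled.
sub ip_key {
    return pack 'C4', split /\./, $_[0];
}

# Sort a list of IPv4 addresses numerically: compute each key once,
# sort on the keys, then return the original strings.
sub sort_ips {
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ ip_key($_), $_ ] } @_;
}

my @nodes = ('10.72.12.150', '10.72.7.122', '10.72.12.148', '10.72.12.149');
print "$_\n" for sort_ips(@nodes);
```

A plain string sort would put 10.72.12.148 ahead of 10.72.7.122; with the packed keys, 10.72.7.122 comes first. In the script, the loop header foreach my $k (keys %old) could then be written as foreach my $k (sort_ips(keys %old)).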

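You can edit the print statements to change the output as needed. You didn't specify an output format, so I just printed what was needed to identify the differences easily.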
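As one illustration of such a change, here is a small sketch that formats a difference as a tab-separated line, which tends to be easier to feed into other tools. The diff_line helper is hypothetical (it is not in the script above), and the // operator it uses needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Build one tab-separated line: node, field, old value, new value.
# Missing (undefined) values are printed as empty strings.
sub diff_line {
    my ($node, $field, $old, $new) = @_;
    return join("\t", $node, $field, $old // '', $new // '') . "\n";
}

print diff_line('10.72.12.148', 'State', 'enabled', 'xenabled');
```

In the script, the printf call inside the final loop could be swapped for print diff_line($k, $sk, $old{$k}{$sk}, $new{$k}{$sk}).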

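It wouldn't be hard to write something similar in awk (especially GNU awk, which has reasonable support for multi-dimensional arrays), but I prefer Perl despite (actually, partly because of) its tendency to be more verbose than awk.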

Answer 1