How do I count the occurrences of each word from one file in all "n" files passed as parameters?

2024-06-28

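I'm looking for a shell script that takes a list of file names as arguments and, for each word in the first file, counts and reports how many times that word occurs in each of the other files.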

I understand how to count the number of times a word appears in a single file.

I'm using this trick:

$ tr ' ' '\n' < FILE | grep -c WORD
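
One caveat with this trick: `grep -c WORD` counts lines that contain WORD anywhere, so after the split, `the` also matches `other` and `theme`. A minimal sketch of a whole-word variant follows, assuming a POSIX-style `tr` and `grep`; the function name `count_word` is hypothetical and used only for illustration.

```shell
# count_word WORD FILE  (hypothetical helper name)
# Print how many times WORD occurs as a whole, whitespace-delimited word in FILE.
count_word() {
  # tr -s puts each word on its own line, squeezing runs of whitespace.
  # grep -F treats WORD as a fixed string, -x requires the whole line to
  # match, and -c prints the number of matching lines.
  # grep exits non-zero when the count is 0 (the 0 is still printed), so
  # "|| true" keeps a zero count from looking like a failure.
  tr -s '[:space:]' '[\n*]' < "$2" | grep -cxF -e "$1" || true
}
```

For a file containing the single line `the cat and the other theme`, `count_word the FILE` prints 2, whereas the substring-based `grep -c the` on the split output reports 4.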

I'm stuck on extending this to n files.

This is what I have so far:

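#!/bin/bash

if [ $# -lt 2 ]
then
    echo "Too few arguments: need a word-list file and at least one file to search." >&2
    exit 1
fi

# Turn the first file into a space-separated sequence of words.
search_file=$(tr '\n' ' ' < "$1")

for other_file in "$@"
do
    # Skip the word-list file itself.
    if [ "$other_file" = "$1" ]
    then
        continue
    fi

    # Put each word of this file on its own line.
    tr ' ' '\n' < "$other_file" > new_temp_file

    # $search_file is deliberately unquoted so that it splits into words.
    for search_word in $search_file
    do
        # Note: grep -c counts lines containing the word as a substring.
        word_freq=$(grep -c -- "$search_word" new_temp_file)
        echo "Word=$search_word Frequency=$word_freq"
    done
done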

Best answer 1

I would do it like this:

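#! /bin/sh -
# usage: wordcount <file-with-words-to-search-for> [<file>...]

# Read the word list, one word per line; grep . drops empty lines.
words=$(tr -s '[:space:]' '[\n*]' < "${1?No word list provided}" | grep .)
[ -n "$words" ] || exit

shift
for file do
  printf 'File: %s\n' "$file"
  # Split the file into one word per line, keep only exact matches of the
  # listed words (-F fixed strings, -x whole line), then count and sort.
  tr -s '[:space:]' '[\n*]' < "$file" | grep -Fxe "$words" | sort | uniq -c | sort -rn
done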

(For each file, this reports counts only for the words that were found at least once in it.)
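
If words that never occur should also be listed with a count of 0, one possible approach is to let `awk` hold the word list in an array. The following is only a sketch under stated assumptions, not part of the answer above: the function name `count_listed_words` and the awk variable `phase` are hypothetical, words are split on awk's default field separators (blanks, tabs, newlines), and file names of the form `name=value` are not supported because awk would read them as variable assignments.

```shell
# count_listed_words WORDLIST FILE...  (hypothetical helper name)
# For each FILE, print every distinct word of WORDLIST with the number of
# times it occurs as a whole word in FILE, including words with count 0.
count_listed_words() {
  wordlist=${1?No word list provided}
  shift
  for file do
    printf 'File: %s\n' "$file"
    # "phase=2" between the two file operands is an awk variable assignment
    # that takes effect after the word list has been read.
    awk '
      # Phase 1 (word list): record each distinct word in order of appearance.
      phase != 2 {
        for (i = 1; i <= NF; i++)
          if (!($i in count)) { count[$i] = 0; order[++n] = $i }
        next
      }
      # Phase 2 (searched file): tally only the words that are in the list.
      {
        for (i = 1; i <= NF; i++)
          if ($i in count) count[$i]++
      }
      END {
        for (i = 1; i <= n; i++)
          printf "%d %s\n", count[order[i]], order[i]
      }
    ' "$wordlist" phase=2 "$file"
  done
}
```

Called as `count_listed_words words.txt a.txt b.txt`, it prints a `File:` header for each searched file followed by one `count word` line per listed word, in word-list order rather than sorted by frequency.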
