How to detect a Christmas tree?

Question

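I have an approach which I think is interesting and a bit different from the rest. The main difference in my approach, compared to some of the others, is in how the image segmentation step is performed: I used the DBSCAN clustering algorithm from Python's scikit-learn. It is optimized for finding somewhat amorphous shapes that may not necessarily have a single clear centroid.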

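At the top level, my approach is fairly simple and can be broken down into about 3 steps. First I apply a threshold (or actually, the logical "or" of two separate and distinct thresholds). As with many of the other answers, I assumed that the Christmas tree would be one of the brighter objects in the scene, so the first threshold is just a simple monochrome brightness test: any pixel with a value above 220 on a 0-255 scale (where black is 0 and white is 255) is saved to a binary black-and-white image. The second threshold looks for red and yellow lights, which are particularly prominent in the trees in the upper left and lower right of the six images, and which stand out well against the blue-green background that is prevalent in most of the photos. I convert the RGB image to HSV space and require that the hue is either less than 0.2 on a 0.0-1.0 scale (corresponding roughly to the border between yellow and green) or greater than 0.95 (corresponding to the border between purple and red). In addition I require bright, saturated colors: saturation and value must both be above 0.7. The results of the two threshold procedures are logically "or"-ed together, and the resulting matrix of black-and-white binary images is shown below: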

[Image: Christmas trees after thresholding on HSV and monochrome brightness]
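As a quick illustration of the two-threshold test described above, here is a minimal per-pixel sketch. It is not the vectorized code from this answer: the helper name `passes_threshold` and the sample pixel values are made up for illustration, it uses the standard-library `colorsys` module instead of matplotlib, and the grayscale value is only an approximation of what PIL's `convert('L')` produces (same ITU-R 601 weights, possibly different rounding).

```python
import colorsys

def passes_threshold(r, g, b, hueleftthr=0.2, huerightthr=0.95,
                     satthr=0.7, valthr=0.7, monothr=220):
    """Return True if an 8-bit RGB pixel passes either threshold."""
    # Threshold 1: monochrome brightness, using ITU-R 601 luma weights
    # (an approximation of PIL's convert('L')).
    gray = (299 * r + 587 * g + 114 * b) // 1000
    bright = gray > monothr

    # Threshold 2: red/yellow hue that is also saturated and bright.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    warm = (h < hueleftthr or h > huerightthr) and s > satthr and v > valthr

    # The two tests are combined with a logical "or".
    return bright or warm

# Hypothetical sample pixels
white_light = passes_threshold(255, 255, 255)   # passes on brightness
red_light = passes_threshold(255, 0, 0)         # passes on hue/sat/value
yellow_light = passes_threshold(255, 220, 0)    # passes on hue/sat/value
background = passes_threshold(0, 120, 130)      # blue-green, fails both
dark_red = passes_threshold(100, 0, 0)          # right hue but too dark
```

Only the first three sample pixels pass; the blue-green background pixel and the dark red pixel are rejected.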

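You can clearly see that each image has one large cluster of pixels roughly corresponding to the location of each tree, and a few of the images also have some smaller clusters corresponding either to lights in the windows of some of the buildings or to a background scene on the horizon. The next step is to get the computer to recognize that these are separate clusters, and to label each pixel correctly with a cluster membership ID number.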

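For this task I chose DBSCAN. The scikit-learn documentation has a very good visual comparison of how DBSCAN typically behaves relative to other clustering algorithms. As I said earlier, it does well with amorphous shapes. The output of DBSCAN, with each cluster plotted in a different color, is shown below: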

[Image: DBSCAN clustering output]
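To make the labelling step concrete, here is a small sketch of what DBSCAN returns for a toy point set. The data (two 5x5 grids and one stray point) and the `eps`/`min_samples` values are invented for this illustration and are not the values used on the tree images.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical data: two tight 5x5 grids far apart, plus one stray point
grid = np.array([[i, j] for i in range(5) for j in range(5)])
points = np.vstack([grid, grid + 100, [[50, 50]]])

# eps is the neighbourhood radius; min_samples is how many points
# (including the point itself) must fall within eps to form a dense region
labels = DBSCAN(eps=1.5, min_samples=4).fit(points).labels_

# Points that belong to no cluster get the label -1 ("noise")
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
```

Each grid ends up with its own cluster ID, and the stray point is labelled -1, which is why the plotting script treats -1 as "no cluster membership".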

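There are a few things to be aware of when looking at this result. First, DBSCAN requires the user to set a "proximity" parameter (`eps` in scikit-learn) in order to regulate its behavior. It effectively controls how far apart a pair of points must be for the algorithm to declare a new separate cluster rather than agglomerating a test point onto an already existing cluster. I set this value to 0.04 times the size along the diagonal of each image. Since the images vary in size from roughly VGA up to about HD 1080, this type of scale-relative definition is critical.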
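To give a feel for what the scale-relative threshold means in pixels, the following sketch evaluates it for two nominal image sizes. The helper name `pixel_eps` is made up here, and the 640x480 and 1920x1080 dimensions are assumed stand-ins for "VGA" and "HD 1080", not the exact sizes of the six sample images.

```python
from math import sqrt

def pixel_eps(height, width, proxthresh=0.04):
    """Convert a diagonal-relative proximity threshold to pixels."""
    return proxthresh * sqrt(height**2 + width**2)

vga_eps = pixel_eps(480, 640)     # diagonal is exactly 800 px
hd_eps = pixel_eps(1080, 1920)    # diagonal is about 2203 px
```

The same 0.04 fraction works out to roughly 32 pixels at VGA size and roughly 88 pixels at 1080p, so a fixed pixel value would be either far too loose for the small images or far too strict for the large ones.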

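Another point worth noting is that the DBSCAN algorithm as implemented in scikit-learn has memory limits which are fairly challenging for some of the larger images in this sample. For a few of the larger images I therefore had to "decimate" the thresholded points (i.e., retain only every 3rd or 4th pixel and drop the others) in order to stay within this limit. As a result of this culling, the remaining individual sparse pixels are difficult to see on some of the larger images. For display purposes only, the color-coded pixels in the images above have therefore been slightly "dilated" so that they stand out better. This is purely a cosmetic operation for the sake of the narrative; there are comments in my code that mention this dilation, but rest assured that it has nothing to do with any calculations that actually matter.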
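The decimation step can be sketched in isolation as follows. This mirrors the stride calculation used in `findtree`, but the helper name `decimate` and the 17,000-point input are hypothetical.

```python
import numpy as np
from math import ceil

def decimate(X, maxpoints=5000):
    """Keep every step-th row so that at most maxpoints rows remain."""
    nsample = len(X)
    if nsample <= maxpoints:
        return X
    # Smallest integer stride that brings the count under the limit
    step = int(ceil(float(nsample) / maxpoints))
    return X[::step]

# Hypothetical input: 17,000 (row, column) pixel coordinates
X = np.arange(34000).reshape(17000, 2)
Xslice = decimate(X)   # stride of 4, so every 4th point is kept
```

With 17,000 points and a limit of 5,000, the stride is 4 and 4,250 points remain. Inputs that are already under the limit are returned unchanged.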

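Once the clusters have been identified and labeled, the third and final step is easy: I simply take the largest cluster in each image (in this case, I chose to measure "size" in terms of the total number of member pixels, although one could just as easily have used some type of metric that gauges physical extent) and compute the convex hull of that cluster. The convex hull then becomes the tree border. The six convex hulls computed via this method are shown below in red: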

[Image: Christmas trees with their calculated borders]
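The "largest cluster, then convex hull" step can be tried on its own with a toy example. The points and labels below are invented for illustration. Note also that this sketch explicitly ignores the DBSCAN noise label -1 when picking the largest cluster, which is an assumption of the sketch.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Hypothetical labelled points: cluster 0 is a 4x4 grid, cluster 1 has 3 points
big = np.array([[i, j] for i in range(4) for j in range(4)])
small = np.array([[20, 20], [20, 21], [21, 20]])
points = np.vstack([big, small])
labels = np.array([0] * len(big) + [1] * len(small))

# Pick the label with the most member points, skipping the noise label -1
candidates = [k for k in set(labels.tolist()) if k != -1]
largest = max(candidates, key=lambda k: np.sum(labels == k))
members = points[labels == largest]

# The convex hull of the winning cluster becomes the border
hull = ConvexHull(members)
corners = members[hull.vertices]    # corner points of the hull
segments = members[hull.simplices]  # one pair of endpoints per border edge
```

Here cluster 0 wins with 16 points, and its hull is the square spanned by the grid's outer corners. `hull.simplices` holds the index pairs that `findtree` turns into line segments for plotting.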

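The source code is written for Python 2.7.6 and depends on numpy, scipy, matplotlib and scikit-learn. I've divided it into two parts. The first part is responsible for the actual image processing: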

from PIL import Image
import numpy as np
import scipy as sp
import scipy.spatial  # make sure the sp.spatial submodule is loaded
import matplotlib.colors as colors
from sklearn.cluster import DBSCAN
from math import ceil, sqrt

"""
Inputs:

    rgbimg:         [M,N,3] numpy array containing (uint, 0-255) color image

    hueleftthr:     Scalar constant to select maximum allowed hue in the
                    yellow-green region

    huerightthr:    Scalar constant to select minimum allowed hue in the
                    purple-red region

    satthr:         Scalar constant to select minimum allowed saturation

    valthr:         Scalar constant to select minimum allowed value

    monothr:        Scalar constant to select minimum allowed monochrome
                    brightness

    maxpoints:      Scalar constant maximum number of pixels to forward to
                    the DBSCAN clustering algorithm

    proxthresh:     Proximity threshold to use for DBSCAN, as a fraction of
                    the diagonal size of the image

Outputs:

    borderseg:      [K,2,2] Nested list containing K pairs of x- and y- pixel
                    values for drawing the tree border

    X:              [P,2] List of pixels that passed the threshold step

    labels:         [Q] List of cluster labels for points in Xslice (see
                    below)

    Xslice:         [Q,2] Reduced list of pixels to be passed to DBSCAN

"""

def findtree(rgbimg, hueleftthr=0.2, huerightthr=0.95, satthr=0.7, 
             valthr=0.7, monothr=220, maxpoints=5000, proxthresh=0.04):

    # Convert rgb image to monochrome for the brightness threshold
    gryimg = np.asarray(Image.fromarray(rgbimg).convert('L'))
    # Convert rgb image (uint, 0-255) to hsv (float, 0.0-1.0)
    hsvimg = colors.rgb_to_hsv(rgbimg.astype(float)/255)

    # Initialize binary thresholded image
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    # Find pixels with hue<0.2 or hue>0.95 (red or yellow) and saturation/value
    # both greater than 0.7 (saturated and bright)--tends to coincide with
    # ornamental lights on trees in some of the images
    boolidx = np.logical_and(
                np.logical_and(
                  np.logical_or((hsvimg[:,:,0] < hueleftthr),
                                (hsvimg[:,:,0] > huerightthr)),
                                (hsvimg[:,:,1] > satthr)),
                                (hsvimg[:,:,2] > valthr))
    # Find pixels that meet hsv criterion
    binimg[np.where(boolidx)] = 255
    # Add pixels that meet grayscale brightness criterion
    binimg[np.where(gryimg > monothr)] = 255

    # Prepare thresholded points for DBSCAN clustering algorithm
    X = np.transpose(np.where(binimg == 255))
    Xslice = X
    nsample = len(Xslice)
    if nsample > maxpoints:
        # Make sure number of points does not exceed DBSCAN maximum capacity
        Xslice = X[::int(ceil(float(nsample)/maxpoints))]

    # Translate DBSCAN proximity threshold to units of pixels and run DBSCAN
    pixproxthr = proxthresh * sqrt(binimg.shape[0]**2 + binimg.shape[1]**2)
    db = DBSCAN(eps=pixproxthr, min_samples=10).fit(Xslice)
    labels = db.labels_.astype(int)

    # Find the largest cluster (i.e., with most points) and obtain convex hull   
    unique_labels = set(labels)
    maxclustpt = 0
    for k in unique_labels:
        # Label -1 marks DBSCAN noise, not a cluster, so skip it
        if k == -1:
            continue
        class_members = [index[0] for index in np.argwhere(labels == k)]
        if len(class_members) > maxclustpt:
            points = Xslice[class_members]
            hull = sp.spatial.ConvexHull(points)
            maxclustpt = len(class_members)
            borderseg = [[points[simplex,0], points[simplex,1]] for simplex
                          in hull.simplices]

    return borderseg, X, labels, Xslice

The second part is a user-level script which calls the first file and generates all of the plots above:

#!/usr/bin/env python

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from findtree import findtree

# Image files to process
fname = ['nmzwj.png', 'aVZhC.png', '2K9EF.png',
         'YowlH.png', '2y4o5.png', 'FWhSP.png']

# Initialize figures
fgsz = (16,7)        
figthresh = plt.figure(figsize=fgsz, facecolor='w')
figclust  = plt.figure(figsize=fgsz, facecolor='w')
figcltwo  = plt.figure(figsize=fgsz, facecolor='w')
figborder = plt.figure(figsize=fgsz, facecolor='w')
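# Set window titles through the canvas manager; recent matplotlib releases
# no longer provide canvas.set_window_title
figthresh.canvas.manager.set_window_title('Thresholded HSV and Monochrome Brightness')
figclust.canvas.manager.set_window_title('DBSCAN Clusters (Raw Pixel Output)')
figcltwo.canvas.manager.set_window_title('DBSCAN Clusters (Slightly Dilated for Display)')
figborder.canvas.manager.set_window_title('Trees with Borders')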

for ii, name in enumerate(fname):
    # Open the file and convert to rgb image
    rgbimg = np.asarray(Image.open(name).convert('RGB'))

    # Get the tree borders as well as a bunch of other intermediate values
    # that will be used to illustrate how the algorithm works
    borderseg, X, labels, Xslice = findtree(rgbimg)

    # Display thresholded images
    axthresh = figthresh.add_subplot(2,3,ii+1)
    axthresh.set_xticks([])
    axthresh.set_yticks([])
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    for v, h in X:
        binimg[v,h] = 255
    axthresh.imshow(binimg, interpolation='nearest', cmap='Greys')

    # Display color-coded clusters
    axclust = figclust.add_subplot(2,3,ii+1) # Raw version
    axclust.set_xticks([])
    axclust.set_yticks([])
    axcltwo = figcltwo.add_subplot(2,3,ii+1) # Dilated slightly for display only
    axcltwo.set_xticks([])
    axcltwo.set_yticks([])
    axcltwo.imshow(binimg, interpolation='nearest', cmap='Greys')
    clustimg = np.ones(rgbimg.shape)    
    unique_labels = set(labels)
    # Generate a unique color for each cluster 
    plcol = cm.rainbow_r(np.linspace(0, 1, len(unique_labels)))
    for lbl, pix in zip(labels, Xslice):
        for col, unqlbl in zip(plcol, unique_labels):
            if lbl == unqlbl:
                # Cluster label of -1 indicates no cluster membership;
                # override default color with black
                if lbl == -1:
                    col = [0.0, 0.0, 0.0, 1.0]
                # Raw version
                for ij in range(3):
                    clustimg[pix[0],pix[1],ij] = col[ij]
                # Dilated just for display
                axcltwo.plot(pix[1], pix[0], 'o', markerfacecolor=col, 
                    markersize=1, markeredgecolor=col)
    axclust.imshow(clustimg)
    axcltwo.set_xlim(0, binimg.shape[1]-1)
    axcltwo.set_ylim(binimg.shape[0], -1)

    # Plot original images with red borders around the trees
    axborder = figborder.add_subplot(2,3,ii+1)
    axborder.set_axis_off()
    axborder.imshow(rgbimg, interpolation='nearest')
    for vseg, hseg in borderseg:
        axborder.plot(hseg, vseg, 'r-', lw=3)
    axborder.set_xlim(0, binimg.shape[1]-1)
    axborder.set_ylim(binimg.shape[0], -1)

plt.show()

