How to use regex to find all overlapping matches Ask Question

How to use regex to find all overlapping matches Ask Question

I'm trying to find every 10 digit series of numbers within a larger series of numbers using re in Python 2.6.

I'm easily able to grab no overlapping matches, but I want every match in the number series. Eg.

in "123456789123456789"

I should get the following list:

[1234567891,2345678912,3456789123,4567891234,5678912345,6789123456,7891234567,8912345678,9123456789]

I've found references to a "lookahead", but the examples I've seen only show pairs of numbers rather than larger groupings and I haven't been able to convert them beyond the two digits.

ベストアンサー1

Use a capturing group inside a lookahead. The lookahead captures the text you're interested in, but the actual match is technically the zero-width substring before the lookahead, so the matches are technically non-overlapping:

import re 
s = "123456789123456789"
matches = re.finditer(r'(?=(\d{10}))', s)
results = [int(match.group(1)) for match in matches]
# results: 
# [1234567891,
#  2345678912,
#  3456789123,
#  4567891234,
#  5678912345,
#  6789123456,
#  7891234567,
#  8912345678,
#  9123456789]

おすすめ記事