Findall function#
Function findall
:
is used to search for all non-overlapping matches in string
returns:
list of strings that are described by regex if there are no groups in regex
list of strings that match with regex in the group if there is only one group in regex
list of tuples containing strings that matches with expression in the group if there are more than one group
Consider the work of findall
with an example of ‘sh mac address-table output’:
In [2]: mac_address_table = open('CAM_table.txt').read()
In [3]: print(mac_address_table)
sw1#sh mac address-table
Mac Address Table
-------------------------------------------
Vlan Mac Address Type Ports
---- ----------- -------- -----
100 a1b2.ac10.7000 DYNAMIC Gi0/1
200 a0d4.cb20.7000 DYNAMIC Gi0/2
300 acb4.cd30.7000 DYNAMIC Gi0/3
100 a2bb.ec40.7000 DYNAMIC Gi0/4
500 aa4b.c550.7000 DYNAMIC Gi0/5
200 a1bb.1c60.7000 DYNAMIC Gi0/6
300 aa0b.cc70.7000 DYNAMIC Gi0/7
The first example is a regex without groups. In this case findall
returns
a list of strings that matches with regex.
For example, with findall
you can get a list of matching strings
with vlan - mac – interface
and get rid of header in the output of command:
In [4]: re.findall(r'\d+ +\S+ +\w+ +\S+', mac_address_table)
Out[4]:
['100 a1b2.ac10.7000 DYNAMIC Gi0/1',
'200 a0d4.cb20.7000 DYNAMIC Gi0/2',
'300 acb4.cd30.7000 DYNAMIC Gi0/3',
'100 a2bb.ec40.7000 DYNAMIC Gi0/4',
'500 aa4b.c550.7000 DYNAMIC Gi0/5',
'200 a1bb.1c60.7000 DYNAMIC Gi0/6',
'300 aa0b.cc70.7000 DYNAMIC Gi0/7']
Note that findall
returns a list of strings, not a Match object.
As soon as a group appears in regex, findall
behaves differently. If one
group is used in the expression, findall
returns a list of strings that
matches with expression in the group:
In [5]: re.findall(r'\d+ +(\S+) +\w+ +\S+', mac_address_table)
Out[5]:
['a1b2.ac10.7000',
'a0d4.cb20.7000',
'acb4.cd30.7000',
'a2bb.ec40.7000',
'aa4b.c550.7000',
'a1bb.1c60.7000',
'aa0b.cc70.7000']
findall
searches for a match of the entire string but returns a result
similar to group
method in Match object.
If there are several groups, findall
will return the list of tuples:
In [6]: re.findall(r'(\d+) +(\S+) +\w+ +(\S+)', mac_address_table)
Out[6]:
[('100', 'a1b2.ac10.7000', 'Gi0/1'),
('200', 'a0d4.cb20.7000', 'Gi0/2'),
('300', 'acb4.cd30.7000', 'Gi0/3'),
('100', 'a2bb.ec40.7000', 'Gi0/4'),
('500', 'aa4b.c550.7000', 'Gi0/5'),
('200', 'a1bb.1c60.7000', 'Gi0/6'),
('300', 'aa0b.cc70.7000', 'Gi0/7')]
If such features of findall
function prevent you from getting the needed
result, it is better to use finditer
function, but sometimes this
behavior is appropriate and convenient to use.
An example of using findall
in a log file parsing (parse_log_findall.py file):
import re
regex = (r'Host \S+ '
r'in vlan (\d+) '
r'is flapping between port '
r'(\S+) and port (\S+)')
ports = set()
with open('log.txt') as f:
result = re.findall(regex, f.read())
for vlan, port1, port2 in result:
ports.add(port1)
ports.add(port2)
print('Loop between ports {} in VLAN {}'.format(', '.join(ports), vlan))
The result is:
$ python parse_log_findall.py
Loop between ports Gi0/19, Gi0/16, Gi0/24 в VLAN 10