Findall function
The findall function:
searches a string for all non-overlapping matches
returns:
a list of the matched strings if the regex contains no groups
a list of the strings matched by the group if the regex contains exactly one group
a list of tuples with the strings matched by each group if the regex contains more than one group
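To make "non-overlapping" concrete, here is a minimal sketch on made-up strings (they are assumptions for illustration, not part of the command output used in this section). The string is scanned left to right, and each new match starts where the previous one ended, so characters that were already consumed are not reused.

```python
import re

# Each match consumes its characters, so '23' and '45' are never reported
pairs = re.findall(r'\d\d', '12345')
print(pairs)  # ['12', '34']

# When nothing matches, findall returns an empty list rather than None
nothing = re.findall(r'\d\d', 'no digits here')
print(nothing)  # []
```

Because the result is always a list, it can be passed to a for loop or to len() without first checking for None.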
Consider how findall works, using the output of the sh mac address-table command as an example:
In [2]: mac_address_table = open('CAM_table.txt').read()
In [3]: print(mac_address_table)
sw1#sh mac address-table
Mac Address Table
-------------------------------------------
Vlan Mac Address Type Ports
---- ----------- -------- -----
100 a1b2.ac10.7000 DYNAMIC Gi0/1
200 a0d4.cb20.7000 DYNAMIC Gi0/2
300 acb4.cd30.7000 DYNAMIC Gi0/3
100 a2bb.ec40.7000 DYNAMIC Gi0/4
500 aa4b.c550.7000 DYNAMIC Gi0/5
200 a1bb.1c60.7000 DYNAMIC Gi0/6
300 aa0b.cc70.7000 DYNAMIC Gi0/7
The first example is a regex without groups. In this case findall returns
a list of the strings that match the regex.
For example, findall can return only the lines that contain a VLAN, MAC address,
type and interface, which drops the header of the command output:
In [4]: re.findall(r'\d+ +\S+ +\w+ +\S+', mac_address_table)
Out[4]:
['100 a1b2.ac10.7000 DYNAMIC Gi0/1',
'200 a0d4.cb20.7000 DYNAMIC Gi0/2',
'300 acb4.cd30.7000 DYNAMIC Gi0/3',
'100 a2bb.ec40.7000 DYNAMIC Gi0/4',
'500 aa4b.c550.7000 DYNAMIC Gi0/5',
'200 a1bb.1c60.7000 DYNAMIC Gi0/6',
'300 aa0b.cc70.7000 DYNAMIC Gi0/7']
Note that findall returns a list of strings, not a Match object.
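The difference from a Match object can be seen by running search and findall with the same regex. This is a small sketch that uses two rows copied from the table above; the variable names sample, first and rows are illustrative.

```python
import re

# Two rows copied from the command output; names here are illustrative
sample = (
    '100 a1b2.ac10.7000 DYNAMIC Gi0/1\n'
    '200 a0d4.cb20.7000 DYNAMIC Gi0/2\n'
)
pattern = r'\d+ +\S+ +\w+ +\S+'

first = re.search(pattern, sample)   # a Match object for the first match only
rows = re.findall(pattern, sample)   # a plain list of str, one item per match

print(first.group())   # text of the first match
print(first.span())    # start/end positions exist only on the Match object
print(rows)
```

The strings returned by findall carry no position information and have no group or span methods; if those are needed, a Match object is the right tool.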
As soon as a group appears in the regex, findall behaves differently. If the
expression contains one group, findall returns a list of the strings that
matched the part of the expression inside the group:
In [5]: re.findall(r'\d+ +(\S+) +\w+ +\S+', mac_address_table)
Out[5]:
['a1b2.ac10.7000',
'a0d4.cb20.7000',
'acb4.cd30.7000',
'a2bb.ec40.7000',
'aa4b.c550.7000',
'a1bb.1c60.7000',
'aa0b.cc70.7000']
findall still has to match the whole expression, but it returns only the part
captured by the group, similar to calling group(1) on a Match object.
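One way to check this relationship is to collect the first group from every Match object produced by finditer and compare the result with findall. The sketch below assumes two rows copied from the table; the variable names are illustrative.

```python
import re

# Two rows copied from the command output
sample = (
    '100 a1b2.ac10.7000 DYNAMIC Gi0/1\n'
    '200 a0d4.cb20.7000 DYNAMIC Gi0/2\n'
)
pattern = r'\d+ +(\S+) +\w+ +\S+'

macs_findall = re.findall(pattern, sample)
# The same list built by hand from Match objects
macs_finditer = [m.group(1) for m in re.finditer(pattern, sample)]

print(macs_findall)                    # ['a1b2.ac10.7000', 'a0d4.cb20.7000']
print(macs_findall == macs_finditer)   # True
```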
If there are several groups, findall returns a list of tuples:
In [6]: re.findall(r'(\d+) +(\S+) +\w+ +(\S+)', mac_address_table)
Out[6]:
[('100', 'a1b2.ac10.7000', 'Gi0/1'),
('200', 'a0d4.cb20.7000', 'Gi0/2'),
('300', 'acb4.cd30.7000', 'Gi0/3'),
('100', 'a2bb.ec40.7000', 'Gi0/4'),
('500', 'aa4b.c550.7000', 'Gi0/5'),
('200', 'a1bb.1c60.7000', 'Gi0/6'),
('300', 'aa0b.cc70.7000', 'Gi0/7')]
If this behavior of findall prevents you from getting the result you need,
it is better to use the finditer function. In many cases, though, the
behavior is exactly what is wanted and is convenient to use.
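A typical case where the group behavior gets in the way is a group that is needed only for alternation. The sketch below is an illustration under assumptions: the (Gi|Fa) alternation and the variable names are made up for the example. It shows two possible workarounds: a non-capturing group (?:...), which is not covered in this section, and finditer, which gives access to both the whole match and the groups.

```python
import re

# Two rows copied from the command output
sample = (
    '100 a1b2.ac10.7000 DYNAMIC Gi0/1\n'
    '200 a0d4.cb20.7000 DYNAMIC Gi0/2\n'
)

# The group is needed for the alternation, but findall returns only its content
capturing = r'\d+ +\S+ +\w+ +(Gi|Fa)\S+'
print(re.findall(capturing, sample))   # ['Gi', 'Gi']

# Option 1: a non-capturing group keeps the alternation, findall returns whole matches
non_capturing = r'\d+ +\S+ +\w+ +(?:Gi|Fa)\S+'
rows = re.findall(non_capturing, sample)
print(rows)

# Option 2: finditer yields Match objects with both the full match and the groups
for m in re.finditer(capturing, sample):
    print(m.group(0), m.groups())
```

finditer is the more general choice when the full match, the groups and the match positions are all needed at once.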
An example of using findall to parse a log file (parse_log_findall.py):
import re

# Capture the VLAN and both ports from each MAC flapping message
regex = (r'Host \S+ '
         r'in vlan (\d+) '
         r'is flapping between port '
         r'(\S+) and port (\S+)')

ports = set()

with open('log.txt') as f:
    # The regex has three groups, so findall returns a list of 3-tuples
    result = re.findall(regex, f.read())
    for vlan, port1, port2 in result:
        ports.add(port1)
        ports.add(port2)

print('Loop between ports {} in VLAN {}'.format(', '.join(ports), vlan))
The result is:
$ python parse_log_findall.py
Loop between ports Gi0/19, Gi0/16, Gi0/24 in VLAN 10
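Two details of this script are worth keeping in mind: a set has no guaranteed order, so the ports may be printed in a different order on another run, and the vlan variable is never assigned if no line in the file matches. The following variant is only a sketch of one way to handle both. The log lines are hypothetical samples placed in a string instead of being read from log.txt, and the "No flapping hosts found" message is an assumption.

```python
import re

# Hypothetical log lines for illustration; the contents of log.txt are not shown here
log = (
    '%SW_MATM-4-MACFLAP_NOTIF: Host 01e2.4c18.0156 in vlan 10 '
    'is flapping between port Gi0/16 and port Gi0/24\n'
    '%SW_MATM-4-MACFLAP_NOTIF: Host 01e2.4c18.0156 in vlan 10 '
    'is flapping between port Gi0/24 and port Gi0/19\n'
)

regex = (r'Host \S+ '
         r'in vlan (\d+) '
         r'is flapping between port '
         r'(\S+) and port (\S+)')

ports = set()
vlans = set()
for vlan, port1, port2 in re.findall(regex, log):
    vlans.add(vlan)
    ports.update((port1, port2))

if ports:
    # sorted() gives a stable (lexicographic) order for the report
    message = 'Loop between ports {} in VLAN {}'.format(
        ', '.join(sorted(ports)), ', '.join(sorted(vlans)))
else:
    message = 'No flapping hosts found'

print(message)  # Loop between ports Gi0/16, Gi0/19, Gi0/24 in VLAN 10
```

Note that sorted() compares the port names as plain strings, so Gi0/2 would be placed after Gi0/19; a custom key would be needed for a numeric order.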