The Three Musketeers of Regular Expressions

Source: 意榕旅游网

I. Regular expressions

3W1H (what, why, where, how)
PS: running grep -o or egrep -o is a good way to see exactly which text a pattern matches.
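
As a small illustration of the tip above, the sketch below compares grep with and without -o. The sample line is borrowed from the oldboy.txt text used later in this article; the pattern 0[0-9]* is only an assumption chosen for the demo.

```shell
# Sample line taken from the article's oldboy.txt; the pattern is illustrative.
line='my qq num is 49000448.'

# Without -o, grep prints the whole line that contains a match.
whole=$(printf '%s\n' "$line" | grep '0[0-9]*')

# With -o, grep prints only the part of the line that the pattern matched.
piece=$(printf '%s\n' "$line" | grep -o '0[0-9]*')

printf 'whole line   : %s\n' "$whole"
printf 'matched part : %s\n' "$piece"
```

The second command prints 000448, which shows that the match starts at the first 0 and then extends over every following digit.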

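1. What is a regular expression?

1.1 It works like special characters

Regular expressions work much like special characters: a regular expression is a set of rules and methods defined for processing large amounts of strings and text.

For example, suppose a developer defines "@" to stand for "I am " and "!" to stand for "oldboy";
then running echo "@!" would output "I am oldboy".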

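1.2 Characteristics of regular expressions in the Linux three musketeers

* They are a set of rules and methods defined for processing large amounts of text and strings.

* They work line by line, that is, one line is processed at a time.

* They turn complex processing tasks into simple ones and make working on Linux more efficient.

* Among everyday Linux commands they are mainly used with the three musketeers (grep/egrep, sed, awk); commands such as ls or cp take shell wildcards instead.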

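2. Why: work more efficiently and get to the content you want quickly

3. Where: the three musketeers commands grep (egrep), sed and awk

4. How to use them (practice)

1) Environment setup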

export LC_ALL=C

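2) Classes of regular expressions

The regular expressions of the Linux three musketeers fall into two classes:
Basic regular expressions (BRE).
The BRE metacharacters are "^$.[]*"; they are used with grep.

Extended regular expressions (ERE).
ERE adds the characters "(){}?+|" on top of BRE; they are used with egrep (grep -E).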
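
To make the difference between the two classes concrete, here is a minimal sketch assuming GNU grep. The three sample lines are made up for this illustration.

```shell
# Sample input is an assumption made up for this illustration.
sample='oldboy
oldgirl
ab+c'

# BRE (plain grep): + is an ordinary character, so only the literal text b+c matches.
bre_plus=$(printf '%s\n' "$sample" | grep 'b+c')

# ERE (grep -E): + means "one or more b", so the line ab+c no longer matches.
ere_plus=$(printf '%s\n' "$sample" | grep -E 'b+c' || true)

# ERE (grep -E): | means "or".
ere_alt=$(printf '%s\n' "$sample" | grep -E 'boy|girl')

printf 'BRE b+c      : %s\n' "$bre_plus"
printf 'ERE b+c      : %s\n' "$ere_plus"
printf 'ERE boy|girl :\n%s\n' "$ere_alt"
```

The same pattern text can therefore mean different things depending on whether grep or grep -E is used.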

4.1 Basic regular expressions

4.1.1 ^ (caret): starts with ...

^oldboy: matches lines that start with oldboy

[root@CCTV ~/test]#grep "^I" oldboy.txt  # keep lines that start with I
I am oldboy teacher！
I teach linux.
I like badminton ball ，billiard ball and chinese chess！

[root@CCTV ~/test]#ls -l /data|grep "^d"  # keep lines that start with d (directories)
drwxr-xr-x. 2 root root 22 Jul 11 12:15 oldboy

4.1.2 $ (dollar sign): ends with ...

oldboy$: matches lines that end with oldboy

[root@CCTV ~/test]#ls -lF /data  # -F appends a slash (/) to directory names
-rw-r--r--. 1 root root 0 Jul 15 23:09 01.txt 
-rw-r--r--. 1 root root 0 Jul 15 23:09 02.txt 
drwxr-xr-x. 2 root root 22 Jul 11 12:15 oldboy/ 
-rw-r--r--. 1 root root 151 Jul 1 10:13 oldboy.tar.gz 
-rw-r--r--. 1 root root 13 Jul 11 11:31 oldboy_hard_link 
lrwxrwxrwx. 1 root root 10 Jul 11 11:37 oldboy_soft_link -> oldboy.txt
lrwxrwxrwx. 1 root root 6 Jul 11 12:16 oldboy_soft_link_dir -> oldboy/

[root@CCTV ~/test]#ls -lF /data|grep "/$"  # keep lines that end with /
drwxr-xr-x. 2 root root 22 Jul 11 12:15 oldboy/ 
lrwxrwxrwx. 1 root root 6 Jul 11 12:16 oldboy_soft_link_dir -> oldboy/

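4.1.3 ^$: empty lines

[root@CCTV ~/test]#grep "^$" oldboy.txt  # prints only the empty lines, which is not useful by itself

[root@CCTV ~/test]#grep -v "^$" oldboy.txt  # -v inverts the match, so the empty lines are dropped

[root@CCTV ~/test]#grep -nv "^$" oldboy.txt  # -v inverts the match, -n shows line numbers (those of the original file)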

4.1.4 . (dot): matches exactly one arbitrary character, like the shell wildcard ?
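
A minimal sketch of the dot, assuming the made-up word list below. The pattern oldb.y requires exactly one character between b and y.

```shell
# The word list is an assumption chosen for this illustration.
words='oldboy
oldbey
oldby
oldbooy'

# The dot stands for exactly one character: not zero, not two.
dot=$(printf '%s\n' "$words" | grep 'oldb.y')

printf '%s\n' "$dot"
```

Only oldboy and oldbey are printed; oldby has no character between b and y, and oldbooy has two.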

4.1.5 \ (backslash): strips a metacharacter of its special meaning so that it is taken literally
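
The effect of escaping can be sketched as follows. The two sample lines are adapted from the article's oldboy.txt, written here with ASCII punctuation as an assumption.

```shell
# Sample lines adapted from oldboy.txt (ASCII punctuation assumed).
text='I teach linux.
I am oldboy teacher!'

# An unescaped dot matches any character, so ".$" matches both lines.
any=$(printf '%s\n' "$text" | grep '.$')

# "\." is a literal period, so only the line that ends with a period matches.
literal=$(printf '%s\n' "$text" | grep '\.$')

printf 'unescaped:\n%s\n' "$any"
printf 'escaped  :\n%s\n' "$literal"
```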

4.1.6 * (asterisk): repeats the preceding character zero or more times

[root@CCTV /data]#grep "0*" oldboy.txt  # * matches the preceding character zero or more times
I am oldboy teacher！  # matches because 0 occurs zero times
I teach linux.

I like badminton ball ，billiard ball and chinese chess！
our site is http://www.oldboyedu.com
my qq num is 49000448.
not 4900000448.
my god ，i am not oldbey，but OLDBOY！

4.1.7 .*: matches everything (any character repeated any number of times), including empty lines

[root@CCTV /data]#grep ".*" oldboy.txt
I am oldboy teacher！
I teach linux.
I like badminton ball ，billiard ball and chinese chess！
our site is http://www.oldboyedu.com
my qq num is 49000448.
not 4900000448.
my god ，i am not oldbey，but OLDBOY！

[root@CCTV /data]#grep -n ".*" oldboy.txt
1:I am oldboy teacher！
2:I teach linux.
3:I like badminton ball ，billiard ball and chinese chess！
4:our site is http://www.oldboyedu.com
5:my qq num is 49000448.
6:not 4900000448.
7:my god ，i am not oldbey，but OLDBOY！

[root@CCTV /data]#grep "0.*" oldboy.txt  # keep lines that contain a 0 followed by anything
my qq num is 49000448.
not 4900000448.

[root@CCTV /data]#grep "[!abc]" oldboy.txt  # inside brackets the exclamation mark is just an ordinary character
I am oldboy teacher！
I teach linux.
I like badminton ball ，billiard ball and chinese chess！
our site is http://www.oldboyedu.com
my god ，i am not oldbey，but OLDBOY！

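4.2 Extended regular expressions

4.2.1 +: matches the preceding character one or more times; similar to *, but it never matches zero occurrences (so it cannot match an empty line)

4.2.2 [:/]+: matches a run of one or more of the characters : or / listed in the brackets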
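
A short sketch of [:/]+ in practice, using the URL line from oldboy.txt. Using the same expression as an awk field separator is an assumption added here for illustration; it is not part of the original examples.

```shell
# Line taken from the article's oldboy.txt.
url='our site is http://www.oldboyedu.com'

# [:/]+ matches the whole run "://" as a single match.
seps=$(printf '%s\n' "$url" | grep -Eo '[:/]+')

# With that run as the field separator, awk sees the host name as field 2.
host=$(printf '%s\n' "$url" | awk -F '[:/]+' '{print $2}')

printf 'separator run: %s\n' "$seps"
printf 'field 2      : %s\n' "$host"
```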

4.2.3 ? (question mark): matches the preceding character zero or one time

[root@CCTV /data]#egrep "lik?" oldboy.txt
I teach linux.
I like badminton ball ，billiard ball and chinese chess！

es?  matches e (zero s) and es (one s)
es*  matches e, es, ess, essssss, essssssssss

e*s* and e?s?
e*    (empty) e ee eeee eeeeee
s*    (empty) s ss ssssss ssssssssssssss
e*s*  (empty) es e s ees essssss
e?s?  (empty) e s es

============================
e?  (empty) e
s?  (empty) s

============================

e+  e ee eee eeee ...
e*  (empty) e ee eeee eeeeee
s*  (empty) s ss ssssss ssssssssssssss
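
The table above can be checked with grep -Eo, which prints each match on its own line. This is a minimal sketch; the sample string is made up for the comparison.

```shell
# Sample string is an assumption made up for this comparison.
sample='e es ess esss'

# paste joins the one-match-per-line output of grep -o into a single line.
q=$(printf '%s\n' "$sample" | grep -Eo 'es?' | paste -sd ' ' -)
star=$(printf '%s\n' "$sample" | grep -Eo 'es*' | paste -sd ' ' -)
plus=$(printf '%s\n' "$sample" | grep -Eo 'es+' | paste -sd ' ' -)

printf 'es?  -> %s\n' "$q"      # e es es es
printf 'es*  -> %s\n' "$star"   # e es ess esss
printf 'es+  -> %s\n' "$plus"   # es ess esss
```

Note how es? never takes more than one s, while es+ skips the lone e because it needs at least one s.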

4.2.4 | (vertical bar): alternation, matches any one of several patterns

[root@CCTV /data]#egrep "oldboy|like|000" oldboy.txt  # egrep is the same as grep -E
I am oldboy teacher！
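
For a self-contained run of the alternation, the sketch below recreates the oldboy.txt text in a temporary file (with ASCII punctuation, which is an assumption) and counts the lines that match any of the three alternatives.

```shell
# Recreate the sample text in a temporary file (ASCII punctuation assumed).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
our site is http://www.oldboyedu.com
my qq num is 49000448.
not 4900000448.
my god ,i am not oldbey,but OLDBOY!
EOF

# A line is kept if it contains oldboy, or like, or 000.
matched=$(grep -E 'oldboy|like|000' "$tmp")
count=$(grep -Ec 'oldboy|like|000' "$tmp")

printf '%s\n' "$matched"
printf 'matching lines: %s\n' "$count"
rm -f "$tmp"
```

The match is case sensitive, so the last line (oldbey, OLDBOY) is not selected.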

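4.2.5 {n,m}: repetition counts in practice

a{n,m} matches the preceding character at least n and at most m times
a{n,} matches the preceding character at least n times
a{n} matches the preceding character exactly n times
a{,m} matches the preceding character at most m times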
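
The counts can be tried on the two numbers from oldboy.txt. This is a sketch using grep -E; in BRE (plain grep) the braces would have to be written as \{ and \}.

```shell
# The two numbers come from the article's oldboy.txt.
nums='49000448
4900000448'

# 9, then exactly three zeros, then 4.
exact3=$(printf '%s\n' "$nums" | grep -E '90{3}4')

# 9, then at least four zeros, then 4.
atleast4=$(printf '%s\n' "$nums" | grep -E '90{4,}4')

# 9, then three to five zeros, then 4: both numbers qualify.
range=$(printf '%s\n' "$nums" | grep -Ec '90{3,5}4')

printf '90{3}4   -> %s\n' "$exact3"
printf '90{4,}4  -> %s\n' "$atleast4"
printf '90{3,5}4 -> %s lines\n' "$range"
```

Anchoring the run of zeros between 9 and 4 matters: a bare 0{3} would also match inside the longer run of five zeros.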

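4.2.6 ( ): grouping; whatever is enclosed in the parentheses is treated as a single unit

In addition, the content matched by ( ) can be referenced later with \n, where n is a number that says which pair of parentheses is meant.

4.2.7 \n: refers back to the content of an earlier ( ) group; for example, (aa)\1 matches aaaa

Contents of oldboy.txt: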

I am oldboy teacher！
I teach linux.
I like badminton ball ，billiard ball and chinese chess！
our site is http://www.oldboyedu.com
my qq num is 49000448.
not 4900000448.
my god ，i am not oldbey，but OLDBOY！

[root@CCTV /data]#egrep -o "(l)\1" oldboy.txt
ll
ll
ll
[root@CCTV /data]#egrep -o "(my)\1" oldboy.txt  # no output: the text does not contain mymy
[root@CCTV /data]#echo mymy>>oldboy.txt  # append mymy to the file
[root@CCTV /data]#egrep -o "(my)\1" oldboy.txt  # (my)\1 is the same as mymy
mymy

4.3 Special bracket expressions

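Not very important; it is enough to be aware of them.
Regular expressions also provide some predefined bracket expressions that can be used directly.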
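
Two of the predefined POSIX classes, [[:upper:]] and [[:digit:]], are sketched below. The choice of these two classes is an assumption for illustration; the sample lines are adapted from oldboy.txt with ASCII punctuation.

```shell
# Sample lines adapted from oldboy.txt (ASCII punctuation assumed).
line1='my god ,i am not oldbey,but OLDBOY!'
line2='my qq num is 49000448.'

# [[:upper:]] is any uppercase letter; the trailing * extends the run.
upper=$(printf '%s\n' "$line1" | grep -o '[[:upper:]][[:upper:]]*')

# [[:digit:]] is any decimal digit.
digits=$(printf '%s\n' "$line2" | grep -o '[[:digit:]][[:digit:]]*')

printf 'uppercase run: %s\n' "$upper"
printf 'digit run    : %s\n' "$digits"
```

The class name sits inside an outer pair of brackets, hence the doubled [[ and ]].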

4.4 Metacharacter expressions

4.4.1 \b: matches a word boundary

4.4.2 \d: matches a single digit character

\d is only recognized when grep is run with the -P option.
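
A minimal sketch of both metacharacters, assuming GNU grep. The -P option depends on grep having been built with PCRE support, so that call is guarded here, and a portable [0-9] alternative is shown next to it. The sample lines are made up for the demo.

```shell
# Sample lines are assumptions made up for this illustration.
text='oldboy
oldboyedu
my qq num is 49000448.'

# \b after oldboy requires the word to end there, so oldboyedu is rejected.
word=$(printf '%s\n' "$text" | grep 'oldboy\b')

# \d needs -P; if this grep lacks PCRE support the result is simply empty.
digits_pcre=$(printf '%s\n' "$text" | grep -oP '\d+' 2>/dev/null || true)

# Portable alternative that does not need -P.
digits_ere=$(printf '%s\n' "$text" | grep -oE '[0-9]+')

printf 'word boundary: %s\n' "$word"
printf 'with -P      : %s\n' "$digits_pcre"
printf 'with [0-9]+  : %s\n' "$digits_ere"
```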

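4.5 sed: the stream editor (one of the Linux three musketeers)

sed stands for Stream Editor.

sed is a powerful tool for manipulating, filtering and transforming text.

Common uses: quickly adding, deleting, modifying and querying file content.

Querying is the most common use: filtering (lines that contain a given string) and picking lines (by line number).

[Syntax]
sed [options] [sed built-in command] [input file]

4.5.1 Querying with sed

Task: print lines 2 and 3 of oldgirl.txt
head -3 oldgirl.txt|tail -2 is equivalent to sed -n '2,3p' oldgirl.txt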

[root@CCTV /data]#cat -n oldgirl.txt
1 I am oldboy teacher!
2 I Like badminton ball ,billiard ball and chinese chess!
3 our site is http://www.oldboyedu.com
4 my qq num is 49000448.

[root@CCTV /data]#head -3 oldgirl.txt|tail -2  # the last two of the first three lines of oldgirl.txt
I Like badminton ball ,billiard ball and chinese chess!
our site is http://www.oldboyedu.com

[root@CCTV /data]#sed -n "2,3p" oldgirl.txt  # -n suppresses sed's default output, so only the selected lines are printed
I Like badminton ball ,billiard ball and chinese chess!
our site is http://www.oldboyedu.com

[root@CCTV /data]#sed "2,3p" oldgirl.txt  # without -n every line is printed by default, and the selected lines are printed a second time
I am oldboy teacher!
I Like badminton ball ,billiard ball and chinese chess!
I Like badminton ball ,billiard ball and chinese chess!
our site is http://www.oldboyedu.com
our site is http://www.oldboyedu.com
my qq num is 49000448.
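
The effect of -n can also be checked by counting output lines. This is a sketch on a made-up four-line file; the file content is an assumption.

```shell
# A made-up four-line file stands in for oldgirl.txt.
tmp=$(mktemp)
printf '%s\n' 'line 1' 'line 2' 'line 3' 'line 4' > "$tmp"

# With -n only the two selected lines are printed.
with_n=$(sed -n '2,3p' "$tmp" | wc -l | tr -d ' ')

# Without -n all four lines are printed, plus lines 2 and 3 a second time.
without_n=$(sed '2,3p' "$tmp" | wc -l | tr -d ' ')

# head piped into tail selects the same two lines as sed -n '2,3p'.
by_head=$(head -3 "$tmp" | tail -2)
by_sed=$(sed -n '2,3p' "$tmp")

printf 'with -n: %s lines, without -n: %s lines\n' "$with_n" "$without_n"
rm -f "$tmp"
```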

Filter the lines that contain the string oldboy

[root@CCTV /data]#grep "oldboy" oldgirl.txt
I am oldboy teacher!
our site is http://www.oldboyedu.com

[root@CCTV /data]#sed -n "/oldboy/" oldgirl.txt  # fails without the p command
sed: -e expression #1, char 8: missing command

[root@CCTV /data]#sed -n "/oldboy/p" oldgirl.txt
I am oldboy teacher!
our site is http://www.oldboyedu.com

[root@CCTV /data]#sed -n /oldboy/p oldgirl.txt  # also works without quotes
I am oldboy teacher!
our site is http://www.oldboyedu.com

[root@CCTV /data]#sed -n '/oldboy/p' oldgirl.txt  # single quotes work too
I am oldboy teacher!
our site is http://www.oldboyedu.com

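4.5.2 Deleting with sed

sed '/oldboy/d' test.txt
# Prints test.txt with every line that contains oldboy removed; the source file itself is unchanged (deletion works on whole lines, and the quotes can be omitted here)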
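
That the source file stays untouched can be verified with a throwaway file. This is a sketch; the file content below is an assumption standing in for test.txt.

```shell
# A throwaway file stands in for test.txt.
tmp=$(mktemp)
printf '%s\n' 'I am oldboy teacher!' 'I teach linux.' \
    'our site is http://www.oldboyedu.com' > "$tmp"

# d deletes every line matching /oldboy/ from the output only.
kept=$(sed '/oldboy/d' "$tmp")

# The file on disk still contains both oldboy lines.
still_there=$(grep -c 'oldboy' "$tmp")

printf 'output     : %s\n' "$kept"
printf 'still in file: %s lines with oldboy\n' "$still_there"
rm -f "$tmp"
```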

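4.5.3 Substituting with sed

Replace every occurrence of the string oldboy in the file with oldgirl

Substitution in vim:
:%s/oldboy/oldgirl/g
The slash delimiter here can be changed to # or another character, as long as all three delimiters are the same.

Substitution in sed:
sed 's#what to replace#what to replace it with#g' oldboy.txt  # does not modify the source file
sed -i 's#oldboy#oldgirl#g' oldboy.txt  # -i modifies the source file
Note: back up the file and check the result before modifying it.

Multiple edits in one command are supported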

sed -e 's#oldboy#oldgirl#g'  -e 's#like#not#g'  oldboy.txt
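
One way to follow the backup advice above is to give -i a suffix, so that sed keeps a copy of the original file. The sketch below assumes a throwaway file and the suffix .bak; both are illustrative choices.

```shell
# A throwaway file stands in for oldboy.txt.
tmp=$(mktemp)
printf '%s\n' 'I am oldboy teacher!' 'I like linux.' > "$tmp"

# Preview both edits first; nothing is written to the file.
preview=$(sed -e 's#oldboy#oldgirl#g' -e 's#like#not#g' "$tmp")

# -i.bak edits the file in place and keeps the original as <file>.bak.
sed -i.bak 's#oldboy#oldgirl#g' "$tmp"
after=$(cat "$tmp")
backup=$(cat "$tmp.bak")

printf 'preview:\n%s\n' "$preview"
printf 'file after -i:\n%s\n' "$after"
printf 'backup:\n%s\n' "$backup"
rm -f "$tmp" "$tmp.bak"
```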

More about sed: https://blog.oldboyedu.com/commands-sed/

4.5.4 Appending text with sed

Append the text "I teach Linux." after line 2 of oldboy.txt
sed -i '2a I teach Linux.' oldboy.txt  # a means append
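
The append command can be previewed without -i first. The sketch below assumes GNU sed, which accepts the one-line form of the a command, and a made-up three-line file.

```shell
# A made-up three-line file stands in for oldboy.txt.
tmp=$(mktemp)
printf '%s\n' 'I am oldboy teacher!' 'I like badminton ball.' \
    'our site is http://www.oldboyedu.com' > "$tmp"

# Without -i the result goes to standard output and the file is not changed.
result=$(sed '2a I teach Linux.' "$tmp")

# The new text ends up as line 3, right after the original line 2.
third=$(printf '%s\n' "$result" | sed -n '3p')
total=$(printf '%s\n' "$result" | wc -l | tr -d ' ')

printf '%s\n' "$result"
rm -f "$tmp"
```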

Extension

ifconfig eth0|sed -n 2p|sed 's#^.*inet ##g'|sed 's#  netm.*##g'
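
To see what each stage of this pipeline does without needing an eth0 interface, the sketch below feeds it a hypothetical second line of ifconfig output. The address and the exact line layout are assumptions; real output differs between systems and ifconfig versions.

```shell
# Hypothetical line 2 of "ifconfig eth0" output; the address is made up.
line2='        inet 10.0.0.200  netmask 255.255.255.0  broadcast 10.0.0.255'

# Stage 1 removes everything up to and including "inet ".
# Stage 2 removes everything from the two spaces before "netmask" onward.
ip=$(printf '%s\n' "$line2" | sed 's#^.*inet ##g' | sed 's#  netm.*##g')

# The same result with one sed call, using a group and the back-reference \1.
ip_one=$(printf '%s\n' "$line2" | sed -E 's#^.*inet ([0-9.]+).*#\1#')

printf 'two-stage: %s\n' "$ip"
printf 'one-stage: %s\n' "$ip_one"
```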
