Commonly used match patterns:
https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns
Patterns can be debugged online:
http://grokdebug.herokuapp.com/
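Grok uses the Oniguruma regular expression library:
It handles East Asian text well. It was first used in Ruby, can also be used from C++, and was written by a Japanese developer.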
http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt
Content:
- 2015-04-29 13:04:23,733 [main] INFO (api.batch.ThreadPoolWorker) Command-line options for this run:
Pattern:
- %{TIMESTAMP_ISO8601:time} \[%{WORD:main}\] %{LOGLEVEL:loglevel} \(%{JAVACLASS:class}\) %{GREEDYDATA:mydata}
Result:
{ "time": [ "2015-04-29 13:04:23,733" ], "main": [ "main" ], "loglevel": [ "INFO" ], "class": [ "api.batch.ThreadPoolWorker" ], "mydata": [ "Command-line options for this run:" ] }
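To see roughly what the grok pattern does under the hood, here is a minimal Python sketch that expands each %{NAME:field} token into a named regex group and matches the same log line. The sub-patterns in GROK_LIKE are simplified stand-ins written for this example, not the real definitions from the grok-patterns file, and grok_to_regex is a hypothetical helper, not part of Logstash.

```python
import re

# Simplified stand-ins for a few grok patterns (assumption: the real
# definitions in the grok-patterns file are more permissive than these).
GROK_LIKE = {
    "TIMESTAMP_ISO8601": r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?",
    "WORD": r"\b\w+\b",
    "LOGLEVEL": r"(?:TRACE|DEBUG|INFO|WARN|ERROR|FATAL)",
    "JAVACLASS": r"(?:[A-Za-z0-9_$]+\.)*[A-Za-z0-9_$]+",
    "GREEDYDATA": r".*",
}


def grok_to_regex(pattern):
    """Expand every %{NAME:field} token into a named group (?P<field>...)."""
    def expand(m):
        return "(?P<%s>%s)" % (m.group(2), GROK_LIKE[m.group(1)])
    return re.sub(r"%\{(\w+):(\w+)\}", expand, pattern)


GROK = (r"- %{TIMESTAMP_ISO8601:time} \[%{WORD:main}\] %{LOGLEVEL:loglevel} "
        r"\(%{JAVACLASS:class}\) %{GREEDYDATA:mydata}")
LINE = ("- 2015-04-29 13:04:23,733 [main] INFO (api.batch.ThreadPoolWorker) "
        "Command-line options for this run:")

# groupdict() returns one plain string per named field.
fields = re.match(grok_to_regex(GROK), LINE).groupdict()
for name, value in fields.items():
    print(name, "=>", value)
```

The extracted fields correspond to the result shown above, except that each value is a plain string rather than a one-element list.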
Content:
/wls/applogs/rtlog/icore-pamsDRServer1351/icore-pamsDRServer1351.out
Pattern:
/wls/applogs/rtlog/(?(?[a-zA-Z-]+)([0-9]*(?:SF)|(?:WII)|(?:DMZ)|(?:DR))([0-9a-zA-Z]+))%{UNIXPATH:filepath}