yahoo)
http://nfabo.cn
rockeet@163.com
2014-07-15
//qq 18016168
4
11
www.eeqee.com
eeqee_com
20115
DFA
DFA
AC
DFA
API
DFA NFA
DFA NFA
Trie: ADFA
MinADFA:
MinADFA
Regular Expression
Lexical Analyzing
Pattern Matching
(AC)
Dictionary Compressing
DFA &
DFA
DFA
DFA
DFADFA
DFA
DFA
DFA
0~63
Tree
DFA (Trie)
DAG
DFA (DAWG)
0~62
Tree
DFA (Trie)
DAG
DFA (DAWG)
1 ~ 99999
O(log(n))
int
int_t
intmax_t
int8_t
int16_t
int32_t
int64_t
uint
uint_t
uintmax_t
uint8_t
uint16_t
uint32_t
uint64_t
Tree
DFA (Trie)
DAG
DFA (DAWG)
DFA
Hopcroft
Myhill-Nerode
(equivalent states) p, q
p, q are Myhill-Nerode equivalent if and only if:
for every string w, δ(p, w) ∈ F ⇔ δ(q, w) ∈ F
Partition Refinement
Hopcroft
P := {F, Q \ F};   // "\" is set difference: Q \ F is the set of non-final states
W := {F, Q \ F};   // W is the waiting set (WaitingSet)
                   // it is enough to start with W := { min(F, Q \ F) },
                   // i.e. only the smaller of the two sets
while (W is not empty) do
    choose and remove a set A from W
    for each c in Σ do
        let X be the set of states for which a transition on c leads to a state in A
        for each set Y in P for which X ∩ Y is nonempty and Y \ X is nonempty do
            replace Y in P by the two sets X ∩ Y and Y \ X
            if Y is in W
                replace Y in W by the same two sets
            else
                add the smaller of X ∩ Y and Y \ X to W
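The pseudocode above can be sketched in C++ roughly as follows. This is a minimal illustration, not the author's implementation: the `Dfa` struct and `hopcroft_partition` are names invented here, the DFA is assumed to be complete (every state has a transition on every symbol), and X is found by scanning all states, so the sketch does not reach the O(n log n) bound that a version with inverse transition lists would.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical dense DFA: delta[s][c] is the target of state s on symbol c.
struct Dfa {
    std::vector<std::vector<int> > delta;
    std::vector<bool> is_final;
};

// Returns block[s] = index of the equivalence class that state s ends up in.
std::vector<int> hopcroft_partition(const Dfa& dfa) {
    const int n = (int)dfa.delta.size();
    assert(dfa.is_final.size() == dfa.delta.size());
    const int sigma = n ? (int)dfa.delta[0].size() : 0;

    std::vector<std::set<int> > P;  // current partition
    std::set<int> fin, non;
    for (int s = 0; s < n; ++s) (dfa.is_final[s] ? fin : non).insert(s);
    if (!fin.empty()) P.push_back(fin);
    if (!non.empty()) P.push_back(non);

    std::set<int> W;  // waiting set, stored as indices into P
    if (P.size() == 2) W.insert(P[0].size() <= P[1].size() ? 0 : 1);
    else if (P.size() == 1) W.insert(0);

    while (!W.empty()) {
        const int a = *W.begin();
        W.erase(W.begin());
        const std::set<int> A = P[a];  // copy: P[a] itself may be split below
        for (int c = 0; c < sigma; ++c) {
            // X = states whose transition on c leads into A
            std::set<int> X;
            for (int s = 0; s < n; ++s)
                if (A.count(dfa.delta[s][c])) X.insert(s);

            const int blocks = (int)P.size();  // new blocks need no re-check
            for (int y = 0; y < blocks; ++y) {
                std::set<int> in, out;  // X ∩ Y and Y \ X
                for (int s : P[y]) (X.count(s) ? in : out).insert(s);
                if (in.empty() || out.empty()) continue;
                P[y] = in;
                P.push_back(out);
                const int z = (int)P.size() - 1;
                if (W.count(y)) W.insert(z);  // Y was waiting: keep both halves
                else W.insert(in.size() <= out.size() ? y : z);  // smaller half
            }
        }
    }

    std::vector<int> block(n, -1);
    for (int b = 0; b < (int)P.size(); ++b)
        for (int s : P[b]) block[s] = b;
    return block;
}
```

States that receive the same block index can be merged into one state of the minimal DFA.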
Hopcroft
DFANFA
Trie
smallmap
O(1)
(permutation)
Waiting Set
Waiting Set
ADFA
ADFA
State Register
Online
map<IsFinal+TargetSet, StateID>
TargetSet StateID pair(Char,Target)
Keymap
Offline
DFA ()DFA
HopcroftO(n)
ADFA
Online ADFA
DFA
/
/(path through)
Offline ADFA
graph-post-order-walk
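As an illustration of the state register combined with a post-order walk, the sketch below minimizes a trie offline: each state is visited after its children, and its signature (IsFinal plus the sorted list of (Char, Target) pairs) is looked up in a map to find an equivalent state that already exists. All names (`State`, `build_trie`, `Minimizer`, `accepts`) are hypothetical, and `std::map` stands in for whatever compact register the real implementation uses.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct State {
    bool is_final;
    std::map<unsigned char, int> next;  // transitions, sorted by character
    State() : is_final(false) {}
};

// Build a plain (unminimized) trie; state 0 is the root.
std::vector<State> build_trie(const std::vector<std::string>& keys) {
    std::vector<State> trie(1);
    for (const std::string& key : keys) {
        int s = 0;
        for (unsigned char c : key) {
            std::map<unsigned char, int>::iterator it = trie[s].next.find(c);
            if (it != trie[s].next.end()) { s = it->second; continue; }
            trie.push_back(State());
            const int t = (int)trie.size() - 1;
            trie[s].next[c] = t;
            s = t;
        }
        trie[s].is_final = true;
    }
    return trie;
}

// Register key: IsFinal + TargetSet, where targets are already-minimized ids.
typedef std::pair<bool, std::vector<std::pair<unsigned char, int> > > Signature;

struct Minimizer {
    const std::vector<State>& in;
    std::vector<State> out;        // minimized automaton
    std::map<Signature, int> reg;  // the state register
    std::vector<int> memo;         // input state -> output state (for DAG inputs)
    explicit Minimizer(const std::vector<State>& g) : in(g), memo(g.size(), -1) {}

    // Post-order walk: children are canonicalized before their parent.
    int visit(int s) {
        if (memo[s] >= 0) return memo[s];
        Signature sig;
        sig.first = in[s].is_final;
        for (const auto& e : in[s].next)
            sig.second.push_back(std::make_pair(e.first, visit(e.second)));
        std::map<Signature, int>::iterator it = reg.find(sig);
        if (it == reg.end()) {  // no equivalent state yet: create one
            State st;
            st.is_final = sig.first;
            for (const auto& e : sig.second) st.next[e.first] = e.second;
            out.push_back(st);
            it = reg.insert(std::make_pair(sig, (int)out.size() - 1)).first;
        }
        return memo[s] = it->second;
    }
};

bool accepts(const std::vector<State>& dfa, int root, const std::string& key) {
    assert(root >= 0 && root < (int)dfa.size());
    int s = root;
    for (unsigned char c : key) {
        std::map<unsigned char, int>::const_iterator it = dfa[s].next.find(c);
        if (it == dfa[s].next.end()) return false;
        s = it->second;
    }
    return dfa[s].is_final;
}
```

For the keys tap, taps, top, tops the trie has 8 states, and calling `visit(0)` leaves 5 states in `out`, because the suffixes "p" and "ps" are shared. The recursion depth equals the longest key, so very long keys would call for an explicit stack.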
ADFA Online
1.
2.
3.
4.
CommonPrefix
CommonPrefixLen
State Register
DFA
ADFA
Confluence State
DAWG: ADFA +
Map<string, Data>
ADFA Set<string>
ADFA ADFA
()ADFA
map<Key,Value> Value
DAWG ()
A:
B: ()
0
2
2 A
DFA Map
key \t value
delim key value
key
delim [0, 256), key
Σ = [0, 257), delim = 256
key
30%
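One way to read the delim = 256 idea: if the alphabet is widened to [0, 257), the delimiter between key and value is a symbol that no byte can collide with, so keys may contain any byte, including tab. The sketch below only shows that encoding; `encode_pair`, `decode_pair` and `kDelim` are names assumed for illustration and are not the library's API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Assumption: symbols are 16-bit so that value 256 fits next to bytes 0..255.
const uint16_t kDelim = 256;

// Encode "key delim value" as a symbol sequence over [0, 257).
std::vector<uint16_t> encode_pair(const std::string& key, const std::string& value) {
    std::vector<uint16_t> out;
    out.reserve(key.size() + 1 + value.size());
    for (unsigned char c : key) out.push_back(c);
    out.push_back(kDelim);
    for (unsigned char c : value) out.push_back(c);
    return out;
}

// Split a symbol sequence back into key and value.
// Returns false unless exactly one delimiter is present.
bool decode_pair(const std::vector<uint16_t>& in, std::string* key, std::string* value) {
    assert(key != nullptr && value != nullptr);
    key->clear();
    value->clear();
    bool seen_delim = false;
    for (uint16_t sym : in) {
        if (sym == kDelim) {
            if (seen_delim) return false;
            seen_delim = true;
        } else if (sym > kDelim) {
            return false;  // outside the alphabet [0, 257)
        } else {
            (seen_delim ? *value : *key).push_back((char)sym);
        }
    }
    return seen_delim;
}
```

With a byte delimiter such as tab, the key "a\tb" would be ambiguous; with symbol 256 it round-trips unchanged.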
DFA
CPU Cache
DAWG
DFA
DFA
MinDFA
DFADFA
DFA
AC (Aho-Corasick)
AC Trie
AC Trie
fail link
AC Double Array
AC
AC
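For readers unfamiliar with fail links, here is a small map-based Aho-Corasick sketch: patterns go into a trie, fail links are computed breadth-first, and the scan follows fail links on a mismatch. It is an assumption-laden illustration (the class and method names are made up here), and it does not attempt the Double Array layout mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct AcNode {
    std::map<unsigned char, int> next;
    int fail;              // longest proper suffix that is also a trie path
    std::vector<int> out;  // ids of patterns ending at this node
    AcNode() : fail(0) {}
};

class AhoCorasick {
public:
    AhoCorasick() : nodes_(1), built_(false) {}

    void add(const std::string& pattern, int id) {
        assert(!built_);
        int s = 0;
        for (unsigned char c : pattern) {
            std::map<unsigned char, int>::iterator it = nodes_[s].next.find(c);
            if (it != nodes_[s].next.end()) { s = it->second; continue; }
            nodes_.push_back(AcNode());
            const int t = (int)nodes_.size() - 1;
            nodes_[s].next[c] = t;
            s = t;
        }
        nodes_[s].out.push_back(id);
    }

    // Breadth-first: a node's fail target is shallower, so it is done first.
    void build() {
        std::queue<int> q;
        for (const auto& e : nodes_[0].next) q.push(e.second);  // depth 1: fail = root
        while (!q.empty()) {
            const int s = q.front();
            q.pop();
            for (const auto& e : nodes_[s].next) {
                const int t = e.second;
                int f = nodes_[s].fail;
                while (f != 0 && !nodes_[f].next.count(e.first)) f = nodes_[f].fail;
                std::map<unsigned char, int>::const_iterator it = nodes_[f].next.find(e.first);
                nodes_[t].fail = (it != nodes_[f].next.end()) ? it->second : 0;
                // Inherit matches that end at the fail target.
                const std::vector<int>& fo = nodes_[nodes_[t].fail].out;
                nodes_[t].out.insert(nodes_[t].out.end(), fo.begin(), fo.end());
                q.push(t);
            }
        }
        built_ = true;
    }

    // Returns (end offset, pattern id) for every occurrence in text.
    std::vector<std::pair<size_t, int> > find_all(const std::string& text) const {
        assert(built_);
        std::vector<std::pair<size_t, int> > hits;
        int s = 0;
        for (size_t i = 0; i < text.size(); ++i) {
            const unsigned char c = (unsigned char)text[i];
            while (s != 0 && !nodes_[s].next.count(c)) s = nodes_[s].fail;
            std::map<unsigned char, int>::const_iterator it = nodes_[s].next.find(c);
            if (it != nodes_[s].next.end()) s = it->second;
            for (int id : nodes_[s].out) hits.push_back(std::make_pair(i + 1, id));
        }
        return hits;
    }

private:
    std::vector<AcNode> nodes_;
    bool built_;
};
```

A typical use is to call `add` for each pattern, then `build` once, then `find_all` on each input text.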
DFA
()
typedef unsigned int state_id_t;       // state number
typedef unsigned char char_t;          // input symbol (one byte)
typedef state_id_t automata_t[][256];  // automata[s][c]: target of state s on byte c
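With a table of this shape, matching is one array lookup per input byte. The sketch below reuses the two typedefs; the `kNilState` sentinel for "no transition", the separate `is_final` array and the function name are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned int  state_id_t;
typedef unsigned char char_t;

// Assumption: this value marks a missing transition.
const state_id_t kNilState = ~0u;

// Runs the DFA from state 0 over text[0..len) and reports acceptance.
// table[s][c] is the target of state s on byte c, or kNilState.
bool dfa_accepts(const state_id_t (*table)[256], const bool* is_final,
                 const char_t* text, size_t len) {
    assert(table != nullptr && is_final != nullptr);
    state_id_t s = 0;
    for (size_t i = 0; i < len; ++i) {
        s = table[s][text[i]];
        if (s == kNilState) return false;  // dead end: reject early
    }
    return is_final[s];
}
```

The dense layout costs 256 entries per state (1 KiB with 32-bit state ids), which is the price paid for the constant-time transition.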
Demo
DFA
DFA
DFA
Google RE2
DFA
99.9%
Offline: BFS/DFS
Online
min/max char +
Bitmap+
popcnt ctz
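To show how a bitmap plus popcnt can replace a 256-entry row, here is a sketch of one state: a 256-bit bitmap records which bytes have a transition, and the targets are stored densely in byte order. The position of a target is the number of set bits below the byte. The struct and function names are hypothetical, and a portable loop stands in for the hardware popcnt instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct BitmapState {
    uint64_t bits[4];               // bit c is set if there is a transition on byte c
    std::vector<uint32_t> targets;  // targets in increasing byte order
};

// Portable stand-in for the popcnt instruction.
static int popcount64(uint64_t x) {
    int n = 0;
    while (x) { x &= x - 1; ++n; }
    return n;
}

// edges must be sorted by byte, with no duplicates.
BitmapState make_state(const std::vector<std::pair<unsigned char, uint32_t> >& edges) {
    BitmapState st;
    for (int i = 0; i < 4; ++i) st.bits[i] = 0;
    for (size_t i = 0; i < edges.size(); ++i) {
        assert(i == 0 || edges[i - 1].first < edges[i].first);
        const unsigned char c = edges[i].first;
        st.bits[c >> 6] |= uint64_t(1) << (c & 63);
        st.targets.push_back(edges[i].second);
    }
    return st;
}

// Looks up the transition on byte c; returns false if there is none.
bool find_target(const BitmapState& st, unsigned char c, uint32_t* target) {
    const unsigned word = c >> 6, bit = c & 63;
    if (((st.bits[word] >> bit) & 1) == 0) return false;
    size_t rank = 0;  // number of set bits strictly below c
    for (unsigned w = 0; w < word; ++w) rank += popcount64(st.bits[w]);
    rank += popcount64(st.bits[word] & ((uint64_t(1) << bit) - 1));
    *target = st.targets[rank];
    return true;
}
```

A state with k transitions then takes 32 bytes of bitmap plus k targets instead of 256 targets.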
Succinct Representation
Rank-Select
30%
bzip2
Trie
Tree Edge Rank-Select
Non-Tree Edge
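As background for the Rank-Select item, the sketch below shows only the rank half: rank1(i) counts the 1 bits in positions [0, i) using one precomputed count per 64-bit word plus a popcount on the last partial word. The class name and layout are assumptions for illustration; real succinct structures use far smaller auxiliary tables and also support select.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class RankBitmap {
public:
    explicit RankBitmap(const std::vector<bool>& bits)
        : n_(bits.size()),
          words_((bits.size() + 63) / 64, 0),
          prefix_(words_.size() + 1, 0) {
        for (size_t i = 0; i < n_; ++i)
            if (bits[i]) words_[i / 64] |= uint64_t(1) << (i % 64);
        // prefix_[w] = number of 1 bits in all words before word w
        for (size_t w = 0; w < words_.size(); ++w)
            prefix_[w + 1] = prefix_[w] + popcount64(words_[w]);
    }

    // Number of 1 bits in positions [0, i); requires i <= size().
    size_t rank1(size_t i) const {
        assert(i <= n_);
        const size_t w = i / 64, r = i % 64;
        if (r == 0) return prefix_[w];
        return prefix_[w] + popcount64(words_[w] & ((uint64_t(1) << r) - 1));
    }

    size_t size() const { return n_; }

private:
    // Portable stand-in for the popcnt instruction.
    static size_t popcount64(uint64_t x) {
        size_t n = 0;
        while (x) { x &= x - 1; ++n; }
        return n;
    }

    size_t n_;
    std::vector<uint64_t> words_;
    std::vector<size_t> prefix_;
};
```

In a succinct tree encoding, rank of this kind is what turns a bit position into the index of a child or an edge without storing explicit pointers.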
Memory Mapping
DFA
Memory Mapping
memcpy
memcpy
mmap
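The memcpy versus mmap contrast can be illustrated with plain POSIX calls: instead of reading a file into a freshly allocated buffer, the file is mapped read-only and used in place. This sketch is independent of the library (whose loader is not shown in the source); `MappedFile`, `map_file` and `unmap_file` are hypothetical names, and it assumes a POSIX system.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct MappedFile {
    const unsigned char* data;
    size_t size;
};

// Maps the whole file read-only. Returns false on any error or empty file.
bool map_file(const char* path, MappedFile* out) {
    assert(path != NULL && out != NULL);
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return false;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return false;
    }
    const size_t size = static_cast<size_t>(st.st_size);
    void* p = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) return false;
    out->data = static_cast<const unsigned char*>(p);
    out->size = size;
    return true;
}

void unmap_file(MappedFile* m) {
    assert(m != NULL);
    if (m->data != NULL) munmap(const_cast<unsigned char*>(m->data), m->size);
    m->data = NULL;
    m->size = 0;
}
```

Pages are loaded on demand and can be shared between processes that map the same file, which is why a structure that can be used directly from its on-disk bytes needs no copy at load time.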
keyonly : strset/dawg
key \t val : map
regex \t data/regex_id
dfa
adfa_build, dawg_build, regex_build, kvbin_build, ac_build, ...
build
MapReduce
DFA
dot
dot
svg
pdf
png
DFA
DFA
dfa = DFA_Interface::load_from(dfafile);
build
adfa_build
dawg_build
map<string, AnyValue>
kvbin_build (delim=256)
set<string>
delim map<string, set<string>>
nested map<string, map<string, map<...> > >
map<ByteArray,set<ByteArray>>
nested map
ac_build ( AC )
build
regex_build
dfa_union
DFA
pinyin_build
API: DFA_Interface
DFA
DFA_Interface::load_from(filename)
DFA
regex DFA
adfa DFA
dawg dfa
ac dfa ( Double Array )
dfa (filename;, )
#include <febird/automata/dfa_interface.hpp>
Anchor
: map<url, AnchorSet>
url AnchorSet
C++11
: map<word, SynonymSet>
(pinyin_build)
P1 * P2 * ... * Pn
SLCF
: n -> O(log(log(n)))
ADFA : n -> O(log(n))
SLCF NP
: http://nfab.cn ()
: C++11
&
C++11 / C++98