
C++ String Toolkit (StrTk) Tokenizer

By Arash Partow, 23 Jun 2013 4.90 (131 votes)


Introduction
This article presents the tokenizing and splitting functionality of a simple C++ library called the String Toolkit. Tokenization, in the context of string processing, is the method by which a sequence of elements is broken up or fragmented into sub-sequences called tokens. The indices in the original sequence that determine such breaks are known as delimiters. There are two types of delimiters: normal or thin delimiters, which are one element long, and thick delimiters, which are two or more elements long. Even though tokenization is primarily used in conjunction with strings, any sequence of types that can be iterated in a linear fashion can be tokenized; examples include a list of integers, a vector of person classes or a map of strings.

Another Tokenizer?
To date there have been many attempts to define a "standard" tokenizer implementation in C++. Of them all, the most likely candidate might be the implementation in the Boost library. Regardless, proposed implementations should to some extent consider one or more of the following areas: over-all usage patterns, constructions, generality (or is it genericity these days?) of design, and performance and efficiency.

1. Over-all usage patterns
This requirement is concerned with how easy it is to instantiate the tokenizer and integrate it into existing processing patterns, which more often than not requires integration with C++ STL algorithms and containers. A tokenizer by definition would be classed as a producer, so the question becomes: how easy is it for others to consume its output? Another issue is consistency of the definition of a token. In the situation where one has consecutive delimiters but is not compressing them, can or should there be such a thing as an empty token? What do preceding and trailing delimiters mean, and when should they be included as part of the token?

2. Constructions
In the context of string tokenizing, the majority of implementations return the token as a new instance of a string. This process requires a string to be created on the heap, populated by the sub-string in question from the original sequence, then returned back (some of this may be alleviated by Return Value Optimization, RVO). In the case of iterators this is essentially another copy to the caller. Furthermore, two kinds of input can make this situation worse: a large sequence made up of lots of very short tokens, or a large sequence made up of lots of very large tokens. The solution is not to return the token as a string but rather as a range, and to allow the caller to decide how they wish to inspect and further manipulate the token.

This minor change in interface design provides a great deal of flexibility and performance gain.

3. Generality (Genericity) of design
Most tokenizer implementations concern themselves only with strings, which for the most part is ok, because most things that need tokenizing are strings. However, there will be times when one has a sequence of types that aren't strings that needs to be tokenized, hence a tokenizer should be designed in such a way as to enable such features. Moreover, it becomes clear that the most basic tokenizing algorithms are invariant to the type of the delimiter.
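
The two design points above (tokens returned as ranges, and an algorithm that does not care about the element type) can be illustrated with a small standard-library sketch. This is not StrTk code: `split_ranges` is a hypothetical helper written only for this illustration. It returns each token as a pair of iterators into the original sequence, so nothing is copied, and it works equally for a `std::string` and a `std::vector<int>`.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of StrTk): splits [first, last) wherever
// is_delimiter(element) is true. Each token is returned as a pair of
// iterators into the original sequence, so no element is copied.
template <typename Iterator, typename Predicate>
std::vector<std::pair<Iterator, Iterator>>
split_ranges(Iterator first, Iterator last, Predicate is_delimiter)
{
   std::vector<std::pair<Iterator, Iterator>> tokens;
   Iterator token_begin = first;

   for (Iterator itr = first; itr != last; ++itr)
   {
      if (is_delimiter(*itr))
      {
         tokens.emplace_back(token_begin, itr); // may be an empty range
         token_begin = std::next(itr);
      }
   }

   tokens.emplace_back(token_begin, last); // final (possibly empty) token
   return tokens;
}
```

A caller that needs a string can build one from a token with `std::string(token.first, token.second)`; a caller that only needs to compare or count can work on the range directly. The returned ranges are only valid while the original sequence is alive and unmodified.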

4. Performance and Efficiency
Tokenizing of strings ranges from low-frequency inputs, such as user input or parsing of simple configuration files, to more complex situations such as tokenizing of HTML pages that a web crawler/indexer might do, to parsing of large market data streams in FIX format. Performance is generally important and can usually be helped along with good usage patterns that encourage reuse of instances, minimal preprocessing, and user-supplied predicates for the nastier areas of the process. It should be noted that everything in the proceeding article can be done purely by using the STL. That said, C++'s ability to allow one to skin the proverbial cat in numerous ways gives rise to novel solutions that are for the most part not of any practical use other than to showcase one's abilities in using the STL.

Getting started
The String Toolkit Library (StrTk) provides two common tokenization concepts: a split function and a token iterator. Both these concepts require the user to provide a delimiter predicate and an iterator range over which the process will be carried out. The tokenization process can be further parametrized by switching between "compressed delimiters" and "no compressed delimiters" mode. In compressed mode, consecutive delimiters are compressed down to one and treated as such. Below are two tables depicting the expected tokens from various types of input. The tables represent the no compressed and compressed tokenization processes respectively. The delimiter in this instance is a pipe symbol | and <> denotes an empty token.

No Compressed Delimiters

Input       Token List
a           a
a|b         a,b
a||b        a,<>,b
|a          <>,a
a|          a,<>
|a||b       <>,a,<>,b
||a||b||    <>,<>,a,<>,b,<>,<>
|           <>,<>
||          <>,<>,<>
|||         <>,<>,<>,<>

Compressed Delimiters

Input       Token List
a           a
a||b        a,b
|a||b       <>,a,b
||a||b||    <>,a,b,<>
|           <>,<>
||          <>,<>
|||         <>,<>
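
The behaviour in the two tables can be reproduced with a few lines of standard C++. The sketch below is only an illustration of the semantics, not StrTk's implementation: `split_tokens` is a hypothetical name, and for readability it returns copies of the tokens as strings rather than ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration of the two modes shown in the tables.
// compress == false : every delimiter ends a token, so n delimiters
//                     always produce n + 1 tokens (some may be empty).
// compress == true  : a run of consecutive delimiters counts as one.
std::vector<std::string> split_tokens(const std::string& input,
                                      const char delimiter,
                                      const bool compress)
{
   std::vector<std::string> tokens;
   std::string current;
   std::size_t i = 0;

   while (i < input.size())
   {
      if (input[i] == delimiter)
      {
         tokens.push_back(current);
         current.clear();
         ++i;
         if (compress)
         {
            // Swallow the remainder of the delimiter run.
            while ((i < input.size()) && (input[i] == delimiter)) ++i;
         }
      }
      else
         current += input[i++];
   }

   tokens.push_back(current); // token after the last delimiter
   return tokens;
}
```

With `'|'` as the delimiter, `split_tokens("||a||b||", '|', false)` yields the seven tokens of the first table, while passing `true` yields the four tokens of the second.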

Delimiters
Two forms of delimiters are supported: the single delimiter predicate and the multiple delimiters predicate, otherwise known as SDP and MDP respectively. Essentially an SDP is where there is only one value that can break or fragment the sequence, whereas with MDPs there is more than one unique value that can break the sequence. It is possible to represent an SDP using the MDP, however from a performance POV having separate predicates is far more efficient. Additionally, for strings based on char or unsigned char (8-bit versions) there is an MDP that has a look-up complexity of O(1), making it considerably more efficient than the basic MDP.

Single Delimiter Predicate
Instantiation requires specialization of type and construction requires an instance of the type.
strtk::single_delimiter_predicate<typename T>(const T& t)

strtk::single_delimiter_predicate<std::string::value_type> predicate('|');

Multiple Delimiter Predicate
Instantiation requires specialization of type and construction requires a sequence of potential delimiters through a range described by iterators.
strtk::multiple_delimiter_predicate<typename T>(Iterator begin, Iterator end);

std::string str_delimiters = " ,.;:<>'[]{}()_?/'`~!@#$%^&*|-_\"=+";
strtk::multiple_delimiter_predicate<char> mdp1(str_delimiters.begin(),str_delimiters.end());

unsigned int uint_delimiters[5] = {1,10,101,1010,10101};
strtk::multiple_delimiter_predicate<unsigned int> mdp2(uint_delimiters,uint_delimiters + 5);

As previously mentioned, tokenization of data need not be limited to strings comprised of chars, but can also be extended to other PODs or complex types. In the above example a predicate used for tokenizing a sequence of unsigned ints is being defined.

Multiple Char Delimiter Predicate
Instantiation requires an iterator range based on either unsigned char or char. This delimiter is more efficient than the simple MDP as it has a predicate evaluation of O(1), due to the use of a lookup table, as opposed to O(n) where n is the number of delimiters. The performance increase grows as more delimiters are used.
strtk::multiple_char_delimiter_predicate(const std::string&)
strtk::multiple_char_delimiter_predicate(const std::string::const_iterator begin,
                                         const std::string::const_iterator end)

strtk::multiple_char_delimiter_predicate predicate(" .;?");
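
To see why a lookup table gives O(1) evaluation regardless of how many delimiters are registered, consider the following sketch. It is a stand-in for the idea only: `char_table_predicate` is a hypothetical class, not StrTk's actual implementation, and it assumes 8-bit chars.

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical stand-in for a char-based multiple delimiter predicate.
// One table entry per possible 8-bit character value: construction costs
// O(number of delimiters), every later query is a single array read.
class char_table_predicate
{
public:
   explicit char_table_predicate(const std::string& delimiters)
   {
      table_.fill(false);
      for (const char c : delimiters)
         table_[static_cast<unsigned char>(c)] = true;
   }

   bool operator()(const char c) const
   {
      // Cast avoids negative indices when char is signed.
      return table_[static_cast<unsigned char>(c)];
   }

private:
   std::array<bool, 256> table_;
};
```

A basic multiple delimiter predicate would instead compare the current character against each registered delimiter in turn, which is where the O(n) cost comes from.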

The delimiter concept can be extended to the point where the predicate itself acts as a state machine, transitioning from state to state based on conditions and rules related to the current symbol being processed. This simple extension can lead to some very interesting parsing capabilities.

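As a rough illustration of a predicate that carries state, the sketch below treats a comma as a delimiter only when it appears outside double quotes. Both `quote_aware_delimiter` and `split_with` are hypothetical names made up for this example; they are not StrTk types, and the splitting loop copies tokens into strings purely to keep the example short.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical two-state predicate: OUTSIDE quotes the delimiter breaks
// the sequence, INSIDE quotes it is treated as ordinary data.
class quote_aware_delimiter
{
public:
   explicit quote_aware_delimiter(const char delimiter)
   : delimiter_(delimiter),
     in_quotes_(false)
   {}

   bool operator()(const char c)
   {
      if (c == '"')
      {
         in_quotes_ = !in_quotes_; // state transition
         return false;             // the quote itself stays in the token
      }
      return !in_quotes_ && (c == delimiter_);
   }

private:
   char delimiter_;
   bool in_quotes_;
};

// Minimal splitting loop that feeds every character to the predicate once,
// in order, which is what a stateful predicate relies upon.
std::vector<std::string> split_with(const std::string& input,
                                    quote_aware_delimiter predicate)
{
   std::vector<std::string> tokens;
   std::string current;

   for (const char c : input)
   {
      if (predicate(c))
      {
         tokens.push_back(current);
         current.clear();
      }
      else
         current += c;
   }

   tokens.push_back(current);
   return tokens;
}
```

A stateful predicate only behaves correctly if it sees each element exactly once and in sequence order, so it should not be shared between two tokenization passes without being reset.
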
Split
This is a function that performs tokenization over an entire sequence in one go. strtk::split takes a sequence through a range of iterators (or, in the case of a string, through a reference to its instance), a delimiter predicate, and an output iterator, otherwise known as a type sink. It populates the output iterator with the tokens it extracts. The tokens in this case are std::pairs of iterators for the sequence.

Split can be used in a "simple - no frills" manner by simply passing the necessary parameters:
std::string str = "abc|123|xyz|789";
strtk::std_string::token_list_type token_list;
strtk::split(" |.;?",str,std::back_inserter(token_list));

strtk::split can also be used in a more explicit manner whereby the exact type of delimiter predicate can be specified by the user:
// split using strtk predicates
{
   std::string str = "abc|123|xyz|789";
   strtk::std_string::token_list_type token_list;
   strtk::single_delimiter_predicate<std::string::value_type> predicate('|');
   strtk::split(predicate,str,std::back_inserter(token_list));
}

// split using a lambda as a predicate
{
   std::string data = "abc|123|xyz|789";
   std::deque<std::string> token_list;
   strtk::split([](const char c) { return '|' == c; },
                data,
                strtk::range_to_type_back_inserter(token_list));
}

strtk::split provides an additional usage option that allows the user to specify whether they would like to compress the delimiters and whether or not they would like to include the delimiter as part of the token range. This enum parameter is called strtk::split_options and has the following values:

Split Option                                  Definition
strtk::split_options::default_mode            Default options
strtk::split_options::compress_delimiters     Consecutive delimiters are treated as one
strtk::split_options::include_1st_delimiter   The first delimiter is included in the resulting token range
strtk::split_options::include_delimiters      All delimiters are included in the resulting token range

The simple example below demonstrates a split that will occur over a string given a predicate, where the provided split options indicate that consecutive delimiters will be treated as one and that all delimiters encountered after each token will also be included in the token, up until the next token begins.
strtk::split(predicate,
             str,
             std::back_inserter(token_list),
             strtk::split_options::compress_delimiters | strtk::split_options::include_delimiters);

Another way of using split may be to use the strtk::multiple_char_delimiter_predicate as follows:

std::string str = "abc?123;xyz.789";
strtk::std_string::token_list_type token_list;
strtk::multiple_char_delimiter_predicate predicate(" .;?");
strtk::split(predicate,str,std::back_inserter(token_list));

The contents of the token_list can be printed out as follows:

strtk::std_string::token_list_type::iterator itr = token_list.begin();
while (token_list.end() != itr)
{
   std::cout << (*itr) << '\t';
   ++itr;
}

Split N-Tokens
A natural extension of strtk::split is strtk::split_n. This function provides the ability to tokenize a sequence up until a specific number of tokens have been encountered or until there are no more tokens left. The return value of strtk::split_n is the number of tokens encountered.
std::string data = "token1?token2,token3;token4,token5";
strtk::std_string::token_list_type token_list;
const std::size_t token_count = 4;
const std::string delimiters (" ,.;?");
strtk::split_n(delimiters,
               data,
               token_count,
               std::back_inserter(token_list));

strtk::std_string::token_list_type::iterator itr = token_list.begin();
while (token_list.end() != itr)
{
   std::cout << "[" << (*itr) << "]\t";
   ++itr;
}
std::cout << std::endl;

Offset Splitter
Another simple variant is the strtk::offset_splitter. This form of split takes a series of offsets and an iterator range or string, and determines the tokens based on the offsets. This function can be set to perform a single pass of the offsets or to rotate them until the range has been completely traversed. The example below demonstrates how a string representing date and time can be tokenized into its constituent parts (year, month, day, hour, minutes, seconds, milliseconds):
std::string time_data = "20000101091011345"; // 2000-01-01 9:10:11sec 345ms
const std::size_t offset_list_size = 7;
const int offset_list[offset_list_size] = {
                                            4, // year
                                            2, // month
                                            2, // day
                                            2, // hour
                                            2, // minute
                                            2, // second
                                            3  // ms
                                          };
const strtk::offset_predicate<offset_list_size> predicate(offset_list);
strtk::std_string::token_list_type token_list;
strtk::offset_splitter(time_data,predicate,std::back_inserter(token_list));

strtk::std_string::token_list_type::iterator itr = token_list.begin();
while (token_list.end() != itr)
{
   std::cout << "[" << (*itr) << "] ";
   ++itr;
}
std::cout << std::endl;
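
The difference between a single pass over the offsets and rotating them can be sketched with the standard library alone. `split_by_offsets` below is a hypothetical function, not StrTk's offset_splitter; in particular, what happens to input left over after a single pass (here it is simply not tokenized) is a choice made for this sketch, not documented StrTk behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical offset-based split: token i has length offsets[i].
// rotate == false : the offsets are consumed once, then splitting stops.
// rotate == true  : the offsets are reused until the input is exhausted.
std::vector<std::string> split_by_offsets(const std::string& input,
                                          const std::vector<std::size_t>& offsets,
                                          const bool rotate)
{
   // Zero-length offsets would never advance the position when rotating.
   assert(!offsets.empty() &&
          std::all_of(offsets.begin(), offsets.end(),
                      [](const std::size_t n) { return n > 0; }));

   std::vector<std::string> tokens;
   std::size_t position = 0;
   std::size_t index    = 0;

   while (position < input.size())
   {
      if (index == offsets.size())
      {
         if (!rotate) break;
         index = 0;
      }

      // The last token is truncated if the input runs out early.
      const std::size_t length = std::min(offsets[index++],
                                          input.size() - position);
      tokens.push_back(input.substr(position, length));
      position += length;
   }

   return tokens;
}
```

Applied to the date and time string from the example with offsets 4, 2, 2, 2, 2, 2, 3 and no rotation, this produces the seven fields year, month, day, hour, minute, second and milliseconds.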

Split Regex
Another form of splitter is based on the concept of using regular expressions as the delimiter predicate. Below is a simple example of splitting tokens wrapped in round brackets.
std::string str = "(12)(345)(6789)(0ijkx)(yz)";
std::list<std::string> token_list;
strtk::split_regex("\\(.*?\\)",
                   str,
                   std::back_inserter(token_list),
                   strtk::regex_match_mode::match_1);

std::list<std::string>::iterator itr = token_list.begin();
while (token_list.end() != itr)
{
   std::cout << "[" << (*itr) << "]\t";
   ++itr;
}
std::cout << std::endl;

Note: The parameter regex_match_mode represents the capture of the marked sub-expression in the current match. By default it is strtk::regex_match_mode::match_all, which in this case would provide the entire match including the bounding pattern, e.g. Token3 would be (0ijkx). However, in the above example we are only interested in the sub-expression between the round brackets, hence strtk::regex_match_mode::match_1 is used, resulting in Token3 being 0ijkx. The following examples demonstrate the use of the strtk::split_regex and strtk::split_regex_n routines in extracting specific types of data, in this case the POD types int and double.
int main()
{
   {
      // Extract ints from data string
      std::string data = "a 1^bc,0023| def?gh(4567ijk)-89 10l,m$n-op+123r@st+3u v*w2y56yz+";
      std::deque<int> int_list;
      strtk::split_regex("([+-]?([\\d]+))",
                         data,
                         strtk::range_to_type_back_inserter(int_list),
                         strtk::regex_match_mode::match_1);
   }

   {
      // Extract doubles from data string
      std::string data = "ab$c1.1?d-2.2ef#ghi+3.3%(123.456)!&*-7.89E+12@^=";
      std::deque<double> double_list;
      strtk::split_regex(strtk::ieee754_expression,
                         data,
                         strtk::range_to_type_back_inserter(double_list),
                         strtk::regex_match_mode::match_1);
   }

   {
      // Extract the first 3 ints from data string
      std::string data = "a 1^bc,0023| def?gh(4567ijk)-89 10l,m$n-op+123r@st+3u v*w2y56yz+";
      std::deque<int> int_list;
      strtk::split_regex_n("([+-]?([\\d]+))",
                           data,
                           3,
                           strtk::range_to_type_back_inserter(int_list),
                           strtk::regex_match_mode::match_1);
   }

   {
      // Extract the first 4 doubles from data string
      std::string data = "ab$c1.1?d-2.2ef#ghi+3.3%(123.456)!&*-7.89E+12@^=";
      std::deque<double> double_list;
      strtk::split_regex_n(strtk::ieee754_expression,
                           data,
                           4,
                           strtk::range_to_type_back_inserter(double_list),
                           strtk::regex_match_mode::match_1);
   }

   return 0;
}
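
The distinction between capturing the whole match and capturing a marked sub-expression can also be seen with the standard `<regex>` facilities. The sketch below is not StrTk code: `regex_tokens` is a hypothetical helper, and its `submatch` argument (0 for the entire match, 1 for the first marked sub-expression) is only loosely analogous to the match modes described above. Note that the pattern used here marks the sub-expression with an explicit capture group.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper built on std::sregex_token_iterator.
// submatch == 0 returns every whole match, submatch == 1 returns the
// first marked sub-expression of every match, and so on.
std::vector<std::string> regex_tokens(const std::string& input,
                                      const std::string& pattern,
                                      const int submatch)
{
   // The regex object must outlive the iterators that refer to it.
   const std::regex expression(pattern);

   return std::vector<std::string>(
             std::sregex_token_iterator(input.begin(), input.end(),
                                        expression, submatch),
             std::sregex_token_iterator());
}
```

For the bracketed input from the first example and the pattern `\(( .*?)\)` written without the space, sub-match 0 gives tokens such as `(0ijkx)` while sub-match 1 gives `0ijkx`.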

The following table describes the various regex patterns built into StrTk which can be used seamlessly with the strtk::split_regex and strtk::split_regex_n routines.

Regex                       Definition
strtk::uri_expression       URI, URL address e.g.: http://www.example.com, domain.example.net/index.html
strtk::email_expression     E-mail address e.g.: some.one@example.com
strtk::ip_expression        IPv4 address e.g.: 192.168.0.1, 127.0.0.1
strtk::ieee754_expression   Floating point value e.g.: 1.1, 1.234e-123, -1.0001E-10, 0.1234

Tokenizer
The tokenizer is specialized on a sequence iterator and predicate. It is constructed with a range of iterators for the sequence and a reference to the desired predicate. Definitions exist for std::string tokenizers. The tokenizer provides an iteration pattern as a means for accessing the tokens in the sequence.
const unsigned int data_size = 12;
unsigned int data[data_size] = {1,2,3,0,4,5,6,0,7,8,0,9};
strtk::single_delimiter_predicate<unsigned int> predicate(0);
typedef strtk::tokenizer<unsigned int*,strtk::single_delimiter_predicate<unsigned int> > tokenizer_type;
tokenizer_type tokenizer(data,data + data_size,predicate);

Similar to strtk::split, strtk::tokenizer provides tokenizing options that are passed in during construction. Below is a table depicting said options:

Tokenize Option                                  Definition
strtk::tokenize_options::default_mode            Default options
strtk::tokenize_options::compress_delimiters     Consecutive delimiters are treated as one
strtk::tokenize_options::include_1st_delimiter   The first delimiter is included in the resulting token range
strtk::tokenize_options::include_delimiters      All delimiters are included in the resulting token range

typedef strtk::tokenizer<unsigned int*,strtk::single_delimiter_predicate<unsigned int> > tokenizer_type;
tokenizer_type tokenizer(data,
                         data + data_size,
                         predicate,
                         strtk::tokenize_options::compress_delimiters | strtk::tokenize_options::include_1st_delimiter);

Furthermore, iteration over the tokens of strtk::tokenizer is performed as follows:

tokenizer_type::iterator itr = tokenizer.begin();
while (tokenizer.end() != itr)
{
   std::copy((*itr).first,(*itr).second,std::ostream_iterator<unsigned int>(std::cout," "));
   std::cout << std::endl;
   ++itr;
}

A typical std::string can be tokenized in the following manner:

std::string str = "abc|123|xyz|789";
strtk::std_string::tokenizer<>::type tokenizer(str,"|");
strtk::std_string::tokenizer<>::type::iterator itr = tokenizer.begin();
while (tokenizer.end() != itr)
{
   std::cout << "[" << (*itr) << "]\t";
   ++itr;
}
std::cout << std::endl;

Another common situation may be tokenizing a sequence of strings, such as the following:
const std::string str_list[] = { "abc" , "delimiter" , "ijk" , "delimiter" ,
                                 "mno" , "delimiter" , "rst" , "uvw" ,
                                 "delimiter" , "xyz" };
const std::size_t str_list_size = sizeof(str_list) / sizeof(std::string);

strtk::range_adapter<std::string> range(str_list,str_list_size);
strtk::single_delimiter_predicate<std::string> predicate("delimiter");
typedef strtk::tokenizer<std::string*,strtk::single_delimiter_predicate<std::string> > tokenizer_type;
tokenizer_type tokenizer(range.begin(),range.end(),predicate);

tokenizer_type::iterator itr = tokenizer.begin();
while (tokenizer.end() != itr)
{
   std::copy((*itr).first,(*itr).second,std::ostream_iterator<std::string>(std::cout," "));
   std::cout << std::endl;
   ++itr;
}

Note: For performance and efficient resource management purposes the strtk::tokenizer does not take ownership or make an internal copy of the sequence being tokenized. As such the strtk::tokenizer expects the range to be valid during the entirety of the tokenization process. This is also the case for the specified predicate.

The Parse Routine
Till now the mentioned routines worked specifically with tokens, or in other words ranges of characters. The responsibility of managing the tokens and converting the tokens to user specified types was done manually via "range to type" oriented back inserters and converters. This can be a bit cumbersome and as such StrTk provides a series of helper routines called strtk::parse.

Parse takes an std::string representing a tuple of delimited values as input data, a delimiter set, and a series of references to variables that are to be populated with the values from the parsed tokens. The following diagram demonstrates the flow of data, tokens and the corresponding relationships and conversions between each of the parameters.

Note: strtk::parse returns a boolean value of true upon successful parsing and false for all other results. Situations that cause strtk::parse to fail include:

  - Insufficient number of tokens for the given number of variables
  - Conversion failure from token range to variable type
  - Empty or null token(s)
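
The three failure conditions listed above can be made concrete with a small standard-library sketch of a parse-style routine. This is not how strtk::parse is implemented: `parse_sketch`, `convert` and `assign` are hypothetical names, conversion goes through `std::istringstream` purely for brevity, and tokens are copied into strings, which StrTk avoids.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Strings are taken verbatim; only an empty token is a failure.
inline bool convert(const std::string& token, std::string& value)
{
   if (token.empty()) return false;
   value = token;
   return true;
}

// Other types: fail on empty tokens and on tokens that are not consumed
// completely by the conversion (e.g. "12x" for an int).
template <typename T>
bool convert(const std::string& token, T& value)
{
   if (token.empty()) return false;
   std::istringstream stream(token);
   stream >> value;
   return !stream.fail() && stream.eof();
}

// End of recursion: every variable has been populated.
inline bool assign(const std::vector<std::string>&, std::size_t)
{
   return true;
}

// Populate one variable per token, stopping at the first failure.
template <typename T, typename... Rest>
bool assign(const std::vector<std::string>& tokens, std::size_t index,
            T& first, Rest&... rest)
{
   return (index < tokens.size())       && // enough tokens?
          convert(tokens[index], first) && // conversion succeeded?
          assign(tokens, index + 1, rest...);
}

// Hypothetical parse-style routine: split on any character found in
// 'delimiters', then convert the leading tokens into the variables.
template <typename... Ts>
bool parse_sketch(const std::string& data, const std::string& delimiters,
                  Ts&... variables)
{
   std::vector<std::string> tokens(1);
   for (const char c : data)
   {
      if (delimiters.find(c) != std::string::npos)
         tokens.emplace_back();
      else
         tokens.back() += c;
   }
   return assign(tokens, 0, variables...);
}
```

In this sketch variables populated before a failure keep their new values, so callers should treat all outputs as unspecified when false is returned.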

Some Simple Parse Examples
strtk::parse can take an arbitrary number of variable references. The code below demonstrates the basic usage of strtk::parse taking various numbers of parameters.
std::string data = "abcde,-1234|567.890#1.1f";
std::string delimiters = ",|#";

std::string var0;
int         var1;
double      var2;
float       var3;

strtk::parse(data,delimiters,var0);
strtk::parse(data,delimiters,var0,var1);
strtk::parse(data,delimiters,var0,var1,var2);
strtk::parse(data,delimiters,var0,var1,var2,var3);

The following examples demonstrate parsing of PODs such as int and double into STL compatible sequences (std::vector, std::deque, std::list, std::set, std::queue, std::stack and std::priority_queue).
// Insert into std::vector
std::string int_string = "1 +2 -3 4 +5 -6 7 +8 -9 10 +11 -12 13 +14 -15";
std::vector<int> int_vector;
strtk::parse(int_string," ",int_vector);

// Insert into std::deque
std::string double_string = "-123.456,789.012,-345.678,901.234,+567.890";
std::deque<double> double_deque;
strtk::parse(double_string,",",double_deque);

// Insert into std::list
std::string data_string = "a,bc,def,ghij,klmno,pqrstu,vwxyz";
std::list<std::string> string_list;
strtk::parse(data_string,",",string_list);

// Insert into std::set
std::string set_string = "a|bc#def|ghij#klmno|pqrstu#vwxyz";
std::set<std::string> string_set;
strtk::parse(set_string,"#|",string_set);

// Insert into std::queue
std::string queue_string = "value1,value2,value3,value4,value5";
std::queue<std::string> string_queue;
strtk::parse(queue_string,",|",string_queue);

// Insert into std::stack
std::string stack_string = "value1|value2,value3|value4,value5";
std::stack<std::string> string_stack;
strtk::parse(stack_string,",|",string_stack);

// Insert into std::priority_queue
std::string priority_queue_string = "value1|value2,value3#value4,value5";
std::priority_queue<std::string> string_priority_queue;
strtk::parse(priority_queue_string,",|#",string_priority_queue);

Similar to what is described above, the following demonstrates parsing of up to "N" elements into an STL compatible sequence.
// Insert 5 elements into std::vector
std::string int_string = "100,-200,+300,400,-500,+600,700,-800,+900";
std::vector<int> int_vector;
strtk::parse_n(int_string,",",5,int_vector);

// Insert 3 elements into std::deque
std::string double_string = "123.456,+789.012,345.678,-901.234,567.890";
std::deque<double> double_deque;
strtk::parse_n(double_string,",",3,double_deque);

// Insert 6 elements into std::list
std::string data_string = "a,bc,def,ghij,klmno,pqrstu,vwxyz";
std::list<std::string> string_list;
strtk::parse_n(data_string,",",6,string_list);

// Insert 6 elements into std::set
std::string set_string = "a|bc#def|ghij#klmno|pqrstu#vwxyz";
std::set<std::string> string_set;
strtk::parse_n(set_string,"#|",6,string_set);

// Insert 4 elements into std::queue
std::string queue_string = "value0,value1,value2,value3,value4,value5";
std::queue<std::string> string_queue;
strtk::parse_n(queue_string,",|",4,string_queue);

// Insert 4 elements into std::stack
std::string stack_string = "value0|value1|value2,value3|value4,value5";
std::stack<std::string> string_stack;
strtk::parse_n(stack_string,",|",4,string_stack);

// Insert 4 elements into std::priority_queue
std::string priority_queue_string = "value0#value1|value2,value3#value4,value5";
std::priority_queue<std::string> string_priority_queue;
strtk::parse_n(priority_queue_string,",|#",4,string_priority_queue);

Note: The return value of the routine strtk::parse_n indicates how many elements were parsed and placed into the specified sequence.

Some Initial Simple Examples

Simple Example 1
As a first example, we'll tackle the simple problem of reversing words in a sentence. That is, given a sentence, have the first word be the last and the last be the first, the second word be the second last, and so on and so forth. Using StrTk we come up with the following solution:
int main()
{
   std::string sentence = "The quick brown fox jumps over the lazy dog";

   std::cout << "Before: " << sentence << std::endl;

   strtk::split(" ",
                sentence,
                strtk::functional_inserter(
                   [](const strtk::range::string& range)
                   {
                      strtk::reverse(range);
                   }
                ));

   strtk::reverse(sentence);

   std::cout << "After: " << sentence << std::endl;

   return 0;
}

With an expected output of:

Before: The quick brown fox jumps over the lazy dog
After: dog lazy the over jumps fox brown quick The

Simple Example 2
Another example: given a list of words to blank out and a sentence, transform the sentence such that the blank-out words are removed. Using StrTk we come up with the following solution:
int main()
{
   std::string sentence = "The quick brown fox jumps over the lazy dog";

   std::unordered_set<std::string> blankout_words;
   blankout_words.insert("quick");
   blankout_words.insert("over");
   blankout_words.insert("lazy");

   std::cout << "Before: " << sentence << std::endl;

   strtk::split(" ",
                sentence,
                strtk::functional_inserter(
                   [&blankout_words](const strtk::range::string& range)
                   {
                      auto itr = blankout_words.find(range);
                      if (blankout_words.end() != itr)
                      {
                         strtk::fill(range,' ');
                      }
                   }
                ));

   strtk::remove_consecutives_inplace(' ',sentence);

   std::cout << "After: " << sentence << std::endl;

   return 0;
}

With an expected output of:

Before: The quick brown fox jumps over the lazy dog
After: The brown fox jumps the dog

Simple Example 3
Another simple example would be, given a text file, to read each of the lines and populate a word-list structure by tokenizing each line into words. The following is an example of how this can be achieved using StrTk:
int main()
{
   std::deque<std::string> word_list;

   strtk::for_each_line("text.txt",
      [&word_list](const std::string& line)
      {
         static const std::string delimiters = "0123456789()[]{}<>"
                                               "\t\r\n ,,.;:'"
                                               "!@#$%^&*_-=+`~/";
         strtk::parse(line,delimiters,word_list);
      });

   return 0;
}

Simple Example 4
The following simple example takes a user specified text file, proceeds to process it, and returns information relating to the file, such as word, letter, uppercase character, lowercase character, vowel and consonant counts.
int main()
{
   std::size_t word_count      = 0;
   std::size_t letter_count    = 0;
   std::size_t uppercase_count = 0;
   std::size_t lowercase_count = 0;
   std::size_t vowel_count     = 0;
   std::size_t consonant_count = 0;

   using namespace strtk;

   for_each_line("data.txt",
      [&](const std::string& line)
      {
         static multiple_char_delimiter_predicate is_vowel("AEIOUaeiou");
         static multiple_char_delimiter_predicate is_lowercase(ext_string::all_lowercase_letters());
         static const std::string delimiters = ext_string::all_chars()
                                               - ext_string::all_lowercase_letters()
                                               - ext_string::all_uppercase_letters();
         split(delimiters,
               line,
               functional_inserter(
                  [&](const strtk::range::string& range)
                  {
                     if (0 == range.size())
                        return;

                     ++word_count;
                     letter_count += range.size();

                     std::size_t current_lowercase_count = 0;
                     std::size_t current_vowel_count     = 0;

                     for (std::size_t i = 0; i < range.size(); ++i)
                     {
                        if (is_vowel(range[i]))     ++current_vowel_count;
                        if (is_lowercase(range[i])) ++current_lowercase_count;
                     }

                     uppercase_count += range.size() - current_lowercase_count;
                     lowercase_count += current_lowercase_count;
                     consonant_count += range.size() - current_vowel_count;
                     vowel_count     += current_vowel_count;
                  }
               ));
      });

   std::cout << "Word count: "      << word_count      << std::endl;
   std::cout << "Letter count: "    << letter_count    << std::endl;
   std::cout << "Uppercase count: " << uppercase_count << std::endl;
   std::cout << "Lowercase count: " << lowercase_count << std::endl;
   std::cout << "Vowel count: "     << vowel_count     << std::endl;
   std::cout << "Consonant count: " << consonant_count << std::endl;

   return 0;
}

Simple Example 5
For the next example, assume we have a text file with a list of names, one per line, that represents the order in which people entered a building. Some of the people may enter, leave and then reenter the building many times, hence their name will appear multiple times in the list. Our task is to reduce this list to a list of unique names while maintaining the relative order of names found in the original list. The following is how this particular requirement can be accomplished by using StrTk:
int main()
{
   strtk::for_each_line("file_name.txt",
      [](const std::string& line)
      {
         static std::unordered_set<std::string> line_set;

         if (line_set.end() != line_set.find(line))
            return;

         line_set.insert(line);
         std::cout << line << std::endl;
      });

   return 0;
}

Simple Example 6
As a final simple example, we would like to calculate the word frequency model of a user specified text file. The process involves reading each line, splitting the line into words, then incrementing the relevant count for each word and maintaining a global word count. Once the file has been processed, the occurrence frequency of each word will be printed to stdout.
int main()
{
   typedef std::unordered_map<std::string,unsigned int> map_t;
   map_t word_hit_list;
   unsigned int word_count = 0;

   strtk::for_each_line("data.txt",
      [&](const std::string& line)
      {
         static const std::string delimiters = strtk::ext_string::all_chars()
                                               - strtk::ext_string::all_lowercase_letters()
                                               - strtk::ext_string::all_uppercase_letters();
         strtk::split(delimiters,
                      line,
                      strtk::functional_inserter(
                         [&](const strtk::range::string& range)
                         {
                            if (range.begin() == range.end())
                               return;

                            ++word_count;
                            std::string word(range.begin(),range.end());
                            ++word_hit_list[word];
                         }
                      ));
      });

   for (map_t::value_type v : word_hit_list)
   {
      printf("%s %10d %10.9f\n",
             strtk::text::right_align(15,' ',v.first).c_str(),
             v.second,
             (1.0 * v.second) / word_count);
   }

   return 0;
}

A Practical Example
Let's assume you have been given an English text file to process, with the intention of extracting a lexicon from the file. One solution would be to break the problem down into a line by line tokenization problem. In this case you would define a functional object such as the following, which takes the container in which you plan on storing your tokens (words) and a predicate, and inserts the tokens as strings into your container.
template <typename Container, typename Predicate>
struct parse_line
{
public:

   parse_line(Container& container, const Predicate& predicate)
   : container_(container),
     predicate_(predicate)
   {}

   inline void operator() (const std::string& str)
   {
      // Delimiter predicate first, as in the earlier strtk::split examples
      strtk::split(predicate_,
                   str,
                   strtk::range_to_type_back_inserter(container_),
                   strtk::split_options::compress_delimiters);
   }

private:

   Container& container_;
   const Predicate& predicate_;
};

The whole thing together, which includes opening the file and reading it line by line, each time invoking parse_line, would be as follows:

template <typename Container>
void parse_text(const std::string& file_name, Container& c)
{
   static const std::string delimiters = " ,.;:<>'[]{}()_?/"
                                         "`~!@#$%^&*|-_\"=+\t\r\n\0"
                                         "0123456789";
   strtk::multiple_char_delimiter_predicate predicate(delimiters);
   strtk::for_each_line(file_name,
                        parse_line<Container,strtk::multiple_char_delimiter_predicate>(c,predicate));
}

int main()
{
   std::string text_file_name = "text.txt";
   std::deque<std::string> word_list;
   parse_text(text_file_name,word_list);
   std::cout << "Token Count: " << word_list.size() << std::endl;
   return 0;
}

Before we continue on with the example, a re-written version of the above code using C++11 lambdas is as follows:
int main()
{
   std::string text_file_name = "text.txt";
   std::deque<std::string> word_list;

   strtk::for_each_line(text_file_name,
      [&word_list](const std::string& line)
      {
         static const std::string delimiters = " ,.;:<>'[]{}()_?/"
                                               "`~!@#$%^&*|-_\"=+\t\r\n\0"
                                               "0123456789";
         strtk::parse(line,delimiters,word_list);
      });

   std::cout << "Token Count: " << word_list.size() << std::endl;

   return 0;
}

Now coming back to the original problem, that being the construction of a lexicon. In this case the set of "words" should only contain words of interest. For the sake of simplicity let's define words of interest as being anything other than the following prepositions: as, at, but, by, for, in, like, next, of, on, opposite, out, past, to, up and via. This type of list is commonly known as a Stop Word List. In this example the stop word list definition will be as follows:
const std::string stop_word_list [] =
   {
      "as", "at", "but", "by", "for",
      "in", "like", "next", "of", "on",
      "opposite", "out", "past", "to",
      "up", "via", ""
   };

const std::size_t stop_word_list_size = sizeof(stop_word_list) / sizeof(std::string);

Some minor updates to the parse_line processor include using the filter_on_match predicate for determining if the currently processed token is a preposition, and also the invocation of the range_to_type back_inserter to convert the tokens from their range iterator representation to a type representation compatible with the user defined container. For the new implementation to provide unique words of interest, the simplest change that can be made is to replace the deque used as the container for the word_list with some kind of 1-1 associative container such as a set. The following is the improved version of the parse_line processor:
template <typename Container, typename Predicate>
struct parse_line
{
public:

   parse_line(Container& container, const Predicate& predicate)
   : container_(container),
     predicate_(predicate),
     tmp_(" "),
     tokenizer_(tmp_,predicate_,true),
     filter_(stop_word_list,stop_word_list + stop_word_list_size,
             strtk::range_to_string_back_inserter_iterator<Container>(container_),
             true,false)
   {}

   inline void operator() (const std::string& s)
   {
      const filter_type& filter = filter_;
      strtk::for_each_token(s,tokenizer_,filter);
   }

private:

   Container& container_;
   const Predicate& predicate_;
   std::string tmp_;
   typename strtk::std_string_tokenizer<Predicate>::type tokenizer_;
   strtk::filter_on_match<strtk::range_to_string_back_inserter_iterator<Container>> filter_;
};

The above described predicate can be greatly simplified by using binders and various lambda expressions.

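The stop-word filtering idea can also be sketched with nothing but the standard library. The helper name collect_words_of_interest and its parameters are assumptions made for this illustration; it is not how StrTk implements filter_on_match.

```cpp
#include <set>
#include <string>

// Hypothetical helper (not part of StrTk): splits 'line' on any character
// found in 'delimiters' and inserts every non-empty token that is not a
// stop word into 'words'. Using a std::set keeps the words unique.
inline void collect_words_of_interest(const std::string& line,
                                      const std::string& delimiters,
                                      const std::set<std::string>& stop_words,
                                      std::set<std::string>& words)
{
   std::size_t begin = line.find_first_not_of(delimiters);
   while (begin != std::string::npos)
   {
      // 'end' is npos for the last token; substr then takes the remainder.
      const std::size_t end = line.find_first_of(delimiters, begin);
      const std::string token = line.substr(begin, end - begin);
      if (stop_words.find(token) == stop_words.end())
         words.insert(token);
      begin = line.find_first_not_of(delimiters, end);
   }
}
```

Calling the helper once per line of the text file, with the same set passed in each time, accumulates the lexicon.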
Another Example
When performing serialization or deserialization of an instance object such as a class or struct, a simple approach one could take would be to take each of the members and convert them into string representations, and from those strings construct a larger string, delimiting each member with a special character guaranteed not to exist in any of the string representations.

In this example we will assume that there exists a struct which represents the properties of a person, a person struct if you will:

struct person
{
   unsigned int id;
   std::string  name;
   unsigned int age;
   double       height;
   float        weight;
};

The process of populating a person struct would entail having an instance of a person and the necessary data string. The following is an example of how this would be done using the strtk::parse function.

Person Tuple Format

Token0           Delimiter0  Token1  Delimiter1  Token2  Delimiter2  Token3      Delimiter3  Token4
Unique ID (hex)  |           Name    |           Age     |           Height (m)  |           Weight (kg)

std::string data = "0xFA37ED12|Rumpelstiltskin|397|1.31|58.7";
person p;
strtk::hex_to_number_sink<unsigned int> hex_sink(p.id); // register id with the hex sink
strtk::parse(data,"|",hex_sink,p.name,p.age,p.height,p.weight);
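To make the tuple layout concrete, here is a minimal standard-library sketch of the same parse. The names person_record and parse_person_record are hypothetical and exist only for this illustration; StrTk's own conversion routines are not used.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical mirror of the person struct, used only by this sketch.
struct person_record
{
   unsigned int id;
   std::string  name;
   unsigned int age;
   double       height;
   float        weight;
};

// Hypothetical helper: parses "hex-id|name|age|height|weight".
// Returns false if a field is missing or fails to convert completely.
inline bool parse_person_record(const std::string& data, person_record& p)
{
   std::istringstream stream(data);
   std::string field[5];
   for (std::size_t i = 0; i < 5; ++i)
   {
      if (!std::getline(stream, field[i], '|'))
         return false;
   }
   char* end = 0;
   // Base 16 conversion accepts an optional "0x" prefix.
   p.id = static_cast<unsigned int>(std::strtoul(field[0].c_str(), &end, 16));
   if (end == field[0].c_str() || *end != '\0') return false;
   p.name = field[1];
   p.age = static_cast<unsigned int>(std::strtoul(field[2].c_str(), &end, 10));
   if (end == field[2].c_str() || *end != '\0') return false;
   p.height = std::strtod(field[3].c_str(), &end);
   if (end == field[3].c_str() || *end != '\0') return false;
   p.weight = static_cast<float>(std::strtod(field[4].c_str(), &end));
   if (end == field[4].c_str() || *end != '\0') return false;
   return true;
}
```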

Batch processing of a text file comprised of one person tuple per line is somewhat similar to the previous example. A predicate is defined that takes a container specialized on the person struct, and a delimiter predicate with which the strtk::parse function will be invoked. This predicate is then instantiated and, coupled with the text file name, fed to the strtk::for_each_line processor.

template <typename Container, typename Predicate>
struct parse_person_tuple
{
public:

   parse_person_tuple(Container& container)
   : container_(container),
     hex_sink(p_.id)
   {}

   inline void operator() (const std::string& s)
   {
      if (strtk::parse(s,"|",hex_sink,p_.name,p_.age,p_.height,p_.weight))
         container_.push_back(p_);
      else
         std::cerr << "Failed to parse: " << s << std::endl;
   }

private:

   Container& container_;
   person p_;
   strtk::hex_to_number_sink<unsigned int> hex_sink;
};

Bringing the above pieces together to process a file results in the following:

int main()
{
   std::string text_file_name = "person_records.txt";
   std::deque<person> person_list;
   // predicate_type is assumed to be a typedef of parse_person_tuple
   // instantiated for std::deque<person>
   strtk::for_each_line(text_file_name,predicate_type(person_list));
   return 0;
}

Before we continue on with the example, a re-written version of the above code using C++11 lambdas is as follows:

int main()
{
   std::string text_file_name = "person_records.txt";
   std::deque<person> person_list;
   person p;
   strtk::hex_to_number_sink<unsigned int> hex_sink(p.id); // register id with the hex sink
   strtk::for_each_line(text_file_name,
                        [&](const std::string& line)
                        {
                           if (strtk::parse(line,"|",hex_sink,p.name,p.age,p.height,p.weight))
                              person_list.push_back(p);
                           else
                              std::cerr << "Failed to parse: " << line << std::endl;
                        });
   return 0;
}

To make things easier one could adapt a struct (made up entirely of PODs) to a parser. This makes the usage syntax a little easier to follow. An example of this adaption is as follows:

struct type
{
   std::string s;
   double d;
   int i;
   char c;
   bool b;
};

strtk_parse_begin(type)
   strtk_parse_type(s)
   strtk_parse_type(d)
   strtk_parse_type(i)
   strtk_parse_type(c)
   strtk_parse_type(b)
strtk_parse_end()

int main()
{
   type t;
   std::string s = "abcdefghijklmnop|123.456|987654321|A|1";
   strtk::parse(s,"|",t);
   return 0;
}

Another example, similar to the previous one, would be parsing a text file of 3D coordinates into a sequence. This can be done easily and cleanly using lambdas and StrTk as follows:

struct point
{
   double x,y,z;
};

int main()
{
   std::string point_data = "point_data.txt";
   std::deque<point> points;
   point p;
   strtk::for_each_line(point_data,
                        [&](const std::string& str)
                        {
                           if (strtk::parse(str,",",p.x,p.y,p.z))
                              points.push_back(p);
                        });
   return 0;
}

Simple Date-Time Parser


Assuming the datetime struct defined below, and a string representation of a combined date and time in the form of YYYY-MM-DD HH:MM:SS.MS, eg: 2005-06-26 15:57:03.678

struct datetime
{
   unsigned int year;
   unsigned int month;
   unsigned int day;
   unsigned int hour;
   unsigned int minute;
   unsigned int second;
   unsigned int msecond;
};

The following assumes an input of date-time values separated by a pipe. To facilitate parsing of a date-time by the strtk::parse routine into an STL compatible sequence, an implementation of string_to_type_converter_impl specific to the datetime type is required. The following demonstrates how such a routine can be generated and used within the strtk::parse context:

strtk_string_to_type_begin(datetime)
   static const std::string delimiters ("-:. ");
   return strtk::parse(begin, end, delimiters,
                       t.year, t.month, t.day,
                       t.hour, t.minute, t.second, t.msecond);
strtk_string_to_type_end()
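For comparison, the same seven fields can be pulled out of a single date-time string with the C standard library alone. This is only a sketch under the assumption of the exact YYYY-MM-DD HH:MM:SS.MS layout; datetime_fields and parse_datetime are hypothetical names and do not come from StrTk.

```cpp
#include <cstdio>
#include <string>

// Hypothetical mirror of the datetime struct, used only by this sketch.
struct datetime_fields
{
   unsigned int year, month, day;
   unsigned int hour, minute, second, msecond;
};

// Hypothetical helper: succeeds only if all seven fields are read and the
// whole string is consumed. Range checks (eg: month <= 12) are not done.
inline bool parse_datetime(const std::string& s, datetime_fields& dt)
{
   int consumed = 0;
   const int matched = std::sscanf(s.c_str(),
                                   "%4u-%2u-%2u %2u:%2u:%2u.%3u%n",
                                   &dt.year, &dt.month,  &dt.day,
                                   &dt.hour, &dt.minute, &dt.second,
                                   &dt.msecond, &consumed);
   return (7 == matched) && (static_cast<std::size_t>(consumed) == s.size());
}
```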

A simple example of using strtk::parse in conjunction with the newly integrated datetime variable mixed with variables of other types is as follows:

int main()
{
   const std::string data = "abc 123 xyz,2000-01-10 03:01:16.123|+98765.43210";
   std::string var0;
   datetime    var1;
   double      var2;
   strtk::parse(data,",|",var0,var1,var2);
   return 0;
}

Bringing the above pieces together, in the following we can then proceed to parse a sequence of date-time strings delimited by pipe "|" into a deque of type datetime.

int main()
{
   static const std::string data = "2000-01-10 03:01:16.123|2001-02-22 05:12:24.234|"
                                   "2002-03-13 07:23:32.345|2003-04-24 09:34:47.456|"
                                   "2004-05-15 11:46:51.767|2005-06-26 15:57:03.678|"
                                   "2006-07-17 17:45:31.561|2007-08-26 19:06:02.809|"
                                   "2008-09-18 21:16:23.267|2009-10-26 23:12:03.798|"
                                   "2010-11-23 13:47:11.963|2011-12-26 15:35:08.168";

   std::deque<datetime> datetime_list;
   strtk::parse(data,"|",datetime_list);

   for (std::size_t i = 0; i < datetime_list.size(); ++i)
   {
      datetime& dt = datetime_list[i];
      std::cout << dt.year << "-"
                << strtk::text::right_align(2,'0',  dt.month) << "-"
                << strtk::text::right_align(2,'0',    dt.day) << " "
                << strtk::text::right_align(2,'0',   dt.hour) << ":"
                << strtk::text::right_align(2,'0', dt.minute) << ":"
                << strtk::text::right_align(2,'0', dt.second) << "."
                << strtk::text::right_align(3,'0',dt.msecond) << std::endl;
   }
   return 0;
}

As a side note, the more commonly used date, time and date-time formats can be easily parsed with a simple utilities library based on StrTk called Datetime Utils. The library makes use of the technique described above in conjunction with the strtk::offset_splitter to provide efficient and high performance parsers for formats such as the ones denoted below:

Format                        Example
YYYYMMDD                      20060304
YYYYDDMM                      20060403
YYYY/MM/DD                    2006/03/04
YYYY/DD/MM                    2006/04/03
DD/MM/YYYY                    04/03/2006
HH:MM:SS.mss                  13:27:54.123
HH:MM:SS                      13:27:54
YYYYMMDD HH:MM:SS.mss         20060304 13:27:54.123
YYYY/MM/DD HH:MM:SS.mss       2006/03/04 13:27:54.123
DD/MM/YYYY HH:MM:SS.mss       04/03/2006 13:27:54.123
YYYYMMDD HH:MM:SS             20060304 13:27:54
YYYY/MM/DD HH:MM:SS           2006/03/04 13:27:54
DD/MM/YYYY HH:MM:SS           04/03/2006 13:27:54
YYYY-MM-DD HH:MM:SS.mss       2006-03-04 13:27:54.123
DD-MM-YYYY HH:MM:SS           04-03-2006 13:27:54
YYYY-MM-DDTHH:MM:SS           2006-03-04T13:27:54
YYYY-MM-DDTHH:MM:SS.mss       2006-03-04T13:27:54.123
YYYYMMDDTHH:MM:SS             20060304T13:27:54
YYYYMMDDTHH:MM:SS.mss         20060304T13:27:54.123

In the following simple example we have an array of data representing tuples of trade executions in CSV format. The objective is to populate the trade_list instance with the given data via the defined trade struct. In the example the dt_utils::datetime_format6 date-time parser is used; it populates a general date-time type instance called dt_utils::datetime. If the parse operation succeeds, then the date-time components the trade type requires are updated and the instance itself is subsequently added to the trade_list.

struct trade
{
   std::string ticker;
   double price;
   unsigned int volume;
   unsigned short hr,min,sec,ms;
};

int main()
{
   std::string trade_data[] =
               {
                  "2006-03-04 13:27:54.123,ABC,12.347,6676",
                  "2006-03-04 13:27:54.231,XYZ,23.455,71547",
                  "2006-03-04 13:27:54.312,IJK,34.562,514",
                  "2006-03-04 13:27:55.263,PQR,67.893,5949",
                  "2006-03-04 13:27:55.327,ABC,78.963,19",
                  "2006-03-04 13:27:55.524,XYZ,45.677,1276",
                  "2006-03-04 13:27:56.623,IJK,36.182,6676",
                  "2006-03-04 13:27:56.877,PQR,62.339,1547"
               };

   std::deque<trade> trade_list;
   trade t;
   dt_utils::datetime dt;
   dt_utils::datetime_format6 dt6(dt);

   for (std::size_t i = 0; i < sizeof(trade_data) / sizeof(std::string); ++i)
   {
      bool result = strtk::parse(trade_data[i],",",
                                 dt6,t.ticker,t.price,t.volume);
      if (result)
      {
         // copy the parsed time components from dt into the trade
         t.hr  = dt.hour;
         t.min = dt.minute;
         t.sec = dt.second;
         t.ms  = dt.millisecond;
         trade_list.push_back(t);
      }
   }
   return 0;
}

Parsing Sub-Lists
So far the demonstrated capabilities of the strtk::parse function have been based on passing a series of parameters that are populated in a linear fashion as the parser processes the tokens it encounters. That said, some formats have their own sub-structures; a simple example would be a list of values (such as integers) that need to be loaded into a deque or stack. StrTk provides a series of sink facilities that consume a range and an STL container which can be forwarded onto strtk::parse. In the following example, the data string is comprised of 3 separate lists delimited by a pipe "|": an integer, a string and a double type list. Each list is to be parsed into an STL container of appropriate type, in this case a vector, a deque and a list. StrTk provides the ability to instantiate a sink for the specific container type that is compatible with strtk::parse.
int main()
{
   std::string data = "1,+2,-3,4|abc,ijk,rst,xyz|123.456,+234.567,-345.678,456.789,567.890";

   // define containers
   std::vector<int> int_vector;
   std::deque<std::string> string_deque;
   std::list<double> double_list;
   std::set<int> int_set;
   std::queue<std::string> string_queue;
   std::stack<double> double_stack;
   std::priority_queue<int> int_priority_queue;

   // define sinks
   strtk::vector_sink<int>::type          vec_sink(",");
   strtk::deque_sink<std::string>::type   deq_sink(",");
   strtk::list_sink<double>::type         lst_sink(",");
   strtk::set_sink<int>::type             set_sink(",");
   strtk::queue_sink<std::string>::type   que_sink(",");
   strtk::stack_sink<double>::type        stk_sink(",");
   strtk::priority_queue_sink<int>::type  prq_sink(",");

   strtk::parse(data,"|",vec_sink(  int_vector),
                         deq_sink(string_deque),
                         lst_sink( double_list));

   strtk::parse(data,"|",set_sink(           int_set),
                         que_sink(      string_queue),
                         stk_sink(      double_stack),
                         prq_sink(int_priority_queue));
   return 0;
}

If only a certain number of elements in the list are required, for example only the first 3, then the element count on the sink can be set appropriately. The above example could be modified as follows:

int main()
{
   std::string data = "1,+2,-3,4|string0|abc,ijk,rst,xyz|string1|123.456,+234.567,-345.678,456.789,567.890";

   std::vector<int> int_vector;
   std::deque<std::string> string_deque;
   std::list<double> double_list;

   strtk::vector_sink<int>::type vec_sink(",");
   strtk::deque_sink<std::string>::type deq_sink(",");
   strtk::list_sink<double>::type lst_sink(",");

   std::string string_0;
   std::string string_1;

   strtk::parse(data,"|",vec_sink(  int_vector).count(2),  // consume first 2 values
                         string_0,
                         deq_sink(string_deque).count(3),  // consume first 3 values
                         string_1,
                         lst_sink( double_list).count(4)); // consume first 4 values
   return 0;
}

Note: If there aren't enough elements in a particular list, then parsing of that list fails and subsequently the whole strtk::parse call will fail as well.

Parsing Trailing-Lists
Another way one might want to parse a tuple of values might be to parse a prefix of values into a specific number of possibly varying types, then to parse the remaining values (assuming they are all of the same type) into a sequence or list etc. StrTk provides the following simple solution to the given problem, as demonstrated below:

int main()
{
   {
      std::string data = "A String Value,111.111,222.222,333.333,444.444,555.555";
      std::string token;
      std::vector<double> double_list;
      strtk::parse(data,",",token,double_list);
   }

   {
      std::string data = "A String Value,01-02-2003,111.111,222.222,333.333,444.444,555.555";
      std::string token;
      std::string date;
      std::deque<double> double_list;
      strtk::parse(data,",",token,date,double_list);
   }

   {
      std::string data = "A String Value,01-02-2003,123456789,111.111,222.222,333.333,444.444,555.555";
      std::string token;
      std::string date;
      int i;
      std::list<double> double_list;
      strtk::parse(data,",",token,date,i,double_list);
   }

   {
      std::string data = "A String Value,01-02-2003,123456789,111.111,222.222,333.333,444.444,555.555";
      std::string token;
      std::string date;
      int i;
      double d;
      std::vector<double> double_list;
      strtk::parse(data,",",token,date,i,d,double_list);
   }

   {
      std::string data = "A String Value,01-02-2003,123456789,111.111,222.222,333.333,444.444,555.555";
      std::string token;
      std::string date;
      int i;
      double d1;
      double d2;
      std::deque<double> double_list;
      strtk::parse(data,",",token,date,i,d1,d2,double_list);
   }

   return 0;
}

Extending The Date-Time Parser Example


Building upon the previous datetime example, we are presented with a tuple of data that represents an astronomical event. The event defines a name, a location and a series of date-times in UTC at which the event was observed. In order to construct the necessary sink(s) that will be used for parsing the required type into a container, the macro strtk_register_userdef_type_sink is invoked with the specified type. The following is a definition of the struct one might construct:

struct event_information
{
   std::size_t id;
   std::string name;
   std::string location;
   std::deque<datetime> observation_datetimes;
};

strtk_register_userdef_type_sink(datetime)

Bringing the above together with a call to strtk::parse results in the following code which parses the event data tuple into the allocated event_information instance.

int main()
{
   std::string event_data = "172493|Lunar Impact|Mare Tranquillitatis|"
                            "2010-01-19 00:28:45.357,2010-02-18 00:57:07.109,"
                            "2010-03-20 01:15:11.261,2010-04-21 01:07:27.972";

   strtk::deque_sink<datetime>::type deq_sink(",");
   event_information evt_info;
   strtk::parse(event_data,"|",evt_info.id,
                               evt_info.name,
                               evt_info.location,
                               deq_sink(evt_info.observation_datetimes));
   return 0;
}

Token Processing During Parsing


StrTk offers a set of convenient and simple token processing primitives that can be used during a call to the strtk::parse routine to perform various actions upon the tokens being parsed. These actions include such things as modifications and validations of tokens. The following is a list of token processing primitives used for constraint and verification purposes:

strtk::ignore_token
strtk::expect
strtk::iexpect
strtk::like
strtk::inrange

The following is a list of token processing primitives used for modifying and normalising purposes:

strtk::trim
strtk::trim_leading
strtk::trim_trailing
strtk::as_lcase
strtk::as_ucase

The primitives all return either a true or false value upon parsing completion, which is then further used by the strtk::parse routine to determine if the parse operation as a whole has succeeded or failed.

Ignore Token Processing

There may be scenarios where, given a delimited tuple of data, one or more of the tokens need to be ignored or skipped. StrTk provides a mechanism called strtk::ignore_token that allows the parser to consume specific tokens easily without affecting overall performance. Below is an example of how strtk::ignore_token can be used in conjunction with strtk::parse to skip the 2nd and 4th tokens in the tuple:

int main()
{
   static const std::string data = "+123,ignore0,123.456,ignore1,abcdef,ignore2";
   int i;
   double d;
   std::string s;
   strtk::ignore_token ignore;
   strtk::parse(data,",",i,ignore,d,ignore,s);
   std::cout << "i = " << i << std::endl;
   std::cout << "d = " << d << std::endl;
   std::cout << "s = " << s << std::endl;
   return 0;
}

Expect and IExpect Token Processing

When parsing a tuple, one may want to ensure that specific tokens of the tuple are of a certain string value. StrTk provides this type of functionality via the strtk::expect and strtk::iexpect mechanisms. The strtk::expect form enforces an exact string match, whereas strtk::iexpect enforces only a case insensitive match. The following is an example where we attempt to parse a 'pascal-like' variable declaration and definition. The requirement is that the first token be "var", followed by a variable name and then a type name of 'Integer' which is not case sensitive.

int main()
{
   static const std::string data = "var foo : InTeGeR = 3;";
   std::string variable_name;
   int initial_value;
   bool result = strtk::parse(data, " ;",
                              strtk::expect("var").ref(),
                              variable_name,
                              strtk::expect(":").ref(),
                              strtk::iexpect("Integer").ref(),
                              strtk::expect("=").ref(),
                              initial_value);
   if (result)
      std::cout << variable_name << " = " << initial_value << std::endl;
   else
      std::cout << "Failed to parse statement!" << std::endl;
   return 0;
}
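The difference between the two forms comes down to how two strings are compared. A minimal sketch of a case insensitive comparison follows; iequals is a hypothetical helper written for this illustration and says nothing about how strtk::iexpect is implemented internally.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true if 'a' and 'b' are equal when case is ignored.
inline bool iequals(const std::string& a, const std::string& b)
{
   if (a.size() != b.size())
      return false;
   for (std::size_t i = 0; i < a.size(); ++i)
   {
      // Cast to unsigned char first: std::tolower is undefined for
      // negative values other than EOF.
      if (std::tolower(static_cast<unsigned char>(a[i])) !=
          std::tolower(static_cast<unsigned char>(b[i])))
         return false;
   }
   return true;
}
```

An exact match, as enforced by the expect form, is simply a == b.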

Like Token Processing

Similar to the above mentioned strtk::expect and strtk::iexpect primitives, StrTk provides simple wildcard matching of tokens via the strtk::like mechanism. The special characters '*' and '?' denote the 'zero or more' and 'zero or one' match modes respectively. The following example uses strtk::like in conjunction with strtk::iexpect to parse a tuple of key/value pairs.

int main()
{
   static const std::string data = "token0=+123;token1=abc;token2=-456.678;";
   int i;
   std::string s;
   double d;
   strtk::parse(data,"=;",
                strtk::like("to*n?").ref(), i,
                strtk::like("token?").ref(), s,
                strtk::iexpect("tOkEn2").ref(), d);
   std::cout << "i = " << i << std::endl;
   std::cout << "s = " << s << std::endl;
   std::cout << "d = " << d << std::endl;
   return 0;
}
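The two match modes can be illustrated with a small recursive matcher. This is a sketch only: wildcard_match is a hypothetical function, it follows the 'zero or more' and 'zero or one' semantics stated in the text, and it is not the algorithm StrTk uses.

```cpp
// Hypothetical matcher: '*' matches zero or more characters and '?' matches
// zero or one character. Both arguments are NUL terminated strings.
// The naive recursion can be slow for patterns with many '*'.
inline bool wildcard_match(const char* pattern, const char* text)
{
   if ('\0' == *pattern)
      return '\0' == *text;
   if ('*' == *pattern)
   {
      // Either '*' matches nothing, or it absorbs one more character.
      return wildcard_match(pattern + 1, text) ||
             (('\0' != *text) && wildcard_match(pattern, text + 1));
   }
   if ('?' == *pattern)
   {
      // Either '?' matches nothing, or it matches exactly one character.
      return wildcard_match(pattern + 1, text) ||
             (('\0' != *text) && wildcard_match(pattern + 1, text + 1));
   }
   return (*pattern == *text) && wildcard_match(pattern + 1, text + 1);
}
```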

In-Range Token Processing

When parsing tokens, one may wish to determine if the token, when viewed in its target type, resides within a specified range [r0,r1]. As the tokens can be of any type, not necessarily just string or numerical in nature, the type must have a less than comparable attribute. The following example attempts to parse a key/value tuple that contains a temperature and a name component.

int main()
{
   static const std::string data = "temperature=+123.456;name=Rumpelstilzchen";
   double temperature;
   std::string name;
   strtk::parse(data,"=;",
                // Process temperature section
                strtk::expect("temperature").ref(),
                strtk::inrange(temperature,-432.1,+432.1).ref(),
                // Process name section
                strtk::expect("name").ref(),
                strtk::inrange(name,"AAAA","zzzz").ref());
   std::cout << "temperature = " << temperature << std::endl;
   std::cout << "name = " << name << std::endl;
   return 0;
}
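The 'less than comparable' requirement means a closed range test can be written using operator< alone, which is why it works for doubles and strings alike. A minimal sketch, with in_range being a hypothetical name for this illustration:

```cpp
#include <string>

// Hypothetical helper: true if low <= value <= high, expressed purely in
// terms of operator< so that any less-than comparable type can be used.
template <typename T>
inline bool in_range(const T& value, const T& low, const T& high)
{
   return !(value < low) && !(high < value);
}
```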

Trim Token Processing

At times tokens within a tuple may have padding added to either the left, right or both ends. When processing the token it may be necessary to remove the superfluous padding before attempting to convert the string or range representation of the token into its target type (int, double etc). The example below demonstrates the use of various forms of token trimming in conjunction with strtk::parse.

int main()
{
   {
      std::string data = "****abc123****,****abc123****,****abc123****";
      std::string s0;
      std::string s1;
      std::string s2;
      strtk::parse(data,",",
                   strtk::trim("*",s0).ref(),
                   strtk::trim_leading ("*",s1).ref(),
                   strtk::trim_trailing("*",s2).ref());
      std::cout << "s0 = [" << s0 << "]" << std::endl;
      std::cout << "s1 = [" << s1 << "]" << std::endl;
      std::cout << "s2 = [" << s2 << "]" << std::endl;
      // Expected Output:
      // s0 = [abc123]
      // s1 = [abc123****]
      // s2 = [****abc123]
   }

   {
      std::string data = "*?*?a string*?*?,*?*123456,123.456?*?*?";
      std::string s;
      int i;
      double d;
      strtk::parse(data,",",
                   strtk::trim("*?",s).ref(),
                   strtk::trim_leading ("?*",i).ref(),
                   strtk::trim_trailing("*?",d).ref());
      std::cout << "s = [" << s << "]" << std::endl;
      std::cout << "i = [" << i << "]" << std::endl;
      std::cout << "d = [" << d << "]" << std::endl;
      // Expected Output:
      // s = [a string]
      // i = [123456]
      // d = [123.456]
   }

   return 0;
}
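What the three trimming forms do to a token can be sketched with std::string searches. The sketch_ prefixed functions below are hypothetical stand-ins written for this illustration; unlike the StrTk primitives they return a new string rather than feeding a conversion.

```cpp
#include <string>

// Hypothetical helper: removes any leading characters found in 'chars'.
inline std::string sketch_trim_leading(const std::string& chars, const std::string& s)
{
   const std::size_t first = s.find_first_not_of(chars);
   return (std::string::npos == first) ? std::string() : s.substr(first);
}

// Hypothetical helper: removes any trailing characters found in 'chars'.
inline std::string sketch_trim_trailing(const std::string& chars, const std::string& s)
{
   const std::size_t last = s.find_last_not_of(chars);
   return (std::string::npos == last) ? std::string() : s.substr(0, last + 1);
}

// Hypothetical helper: removes padding from both ends.
inline std::string sketch_trim(const std::string& chars, const std::string& s)
{
   return sketch_trim_trailing(chars, sketch_trim_leading(chars, s));
}
```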

Case Normalisation Token Processing

Another pair of token processing mechanisms provided by StrTk are the strtk::as_lcase and strtk::as_ucase mechanisms. They convert the string representation of the token to lowercase and uppercase characters respectively. The following example parses a two token tuple, and converts the first token s0 to all lowercase and the second token s1 to all uppercase.

int main()
{
   std::string data = "AbCd,EfGhI";
   std::string s0;
   std::string s1;
   strtk::parse(data,",",
                strtk::as_lcase(s0).ref(),
                strtk::as_ucase(s1).ref());
   std::cout << "s0 = [" << s0 << "]" << std::endl;
   std::cout << "s1 = [" << s1 << "]" << std::endl;
   // Expected Output:
   // s0 = [abcd]
   // s1 = [EFGHI]
   return 0;
}

Parsing Truncated Values

There may be times during parsing when a token which is intended to be parsed as an integral type (eg: int, short, unsigned int et al) may be represented using decimal notation (eg: 1234.00000). Normally if the token were to be passed as is, it would cause a conversion error due to the fact that there are invalid characters within the token. StrTk provides a facility called strtk::truncated_int that can be used with both signed and unsigned integral types. The type truncated_int is specialised with the required type, then an instance of the type is registered with it either prior to or inline with the conversion/parse call. When the conversion occurs strtk::truncated_int simply redefines the end of the token range to be the decimal point (if it is present) and then passes it along to the appropriate string_to_type_converter_impl call.

int main()
{
   {
      // Convert decimal representation to an int
      int i = 0;
      std::string data = "-1234.0000";
      strtk::truncated_int<int> ti;
      strtk::string_to_type_converter(data,ti(i));
   }

   {
      // Parse of tuple of decimal values into ints i0 and i1
      int i0 = 0;
      int i1 = 0;
      std::string data = "-1234.0000|456.7890";
      strtk::truncated_int<int> ti0;
      strtk::truncated_int<int> ti1;
      strtk::parse(data,"|",ti0(i0),ti1(i1));
   }
}
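The idea of moving the end of the token to the decimal point can be sketched as follows. The helper to_truncated_int is hypothetical; it ignores overflow and does not validate the fractional digits, so it should be read as an illustration of the technique rather than a replacement for strtk::truncated_int.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: converts only the part of 'token' that precedes the
// decimal point (if any). Returns false if that part is empty or contains
// characters that are not part of an integer.
inline bool to_truncated_int(const std::string& token, int& result)
{
   // find returns npos when there is no '.', so substr keeps the whole token.
   const std::string integral = token.substr(0, token.find('.'));
   if (integral.empty())
      return false;
   char* end = 0;
   const long value = std::strtol(integral.c_str(), &end, 10);
   if ('\0' != *end)
      return false;
   result = static_cast<int>(value);
   return true;
}
```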

An optional parameter that can be utilized is the 'fractional_size', which denotes the exact number of digits expected after the decimal place. In the event this condition is not met a conversion error is returned.

int main()
{
   std::string data = "-1234.000";
   strtk::truncated_int<int> ti;
   ti.fractional_size(3);
   int i = 0;
   if (strtk::string_to_type_converter(data,ti(i)))
      std::cout << "i: " << i << std::endl;
   else
      std::cout << "Error parsing: " << data << std::endl;
   return 0;
}

In the following example, we have an array of trade execution tuples in csv format. The tuple is comprised of the following fields: ticker name (string), trade id (uint64, right aligned with space as padding), execution price (double), executed volume (unsigned int with 4 decimal place suffix). The struct trade will be used to store the tuples in memory. The objective is to parse each tuple and populate the trade_list structure with all the trades, noting any errors that occur along the way.

struct trade
{
   unsigned long long id;
   std::string ticker;
   double price;
   unsigned int volume;
};

int main()
{
   std::string trade_data[] =
               {
                  "AAA,  0,36.900,009352.0000",
                  "BBB,  1,46.000,009800.0000",
                  "CCC,  2,46.000,009400.0000",
                  "DDD,  3,46.000,001500.0000",
                  "AAA,  4,46.000,000600.0000",
                  "BBB,  5,46.000,039800.0000",
                  "CCC,  6,46.000,003200.0000",
                  "DDD,  7,46.000,000100.0000",
                  "AAA,  8,46.000,000800.0000",
                  "BBB,  9,48.500,002400.0000",
                  "CCC, 10,37.395,101200.0000",
                  "DDD, 11,37.193,200000.0000",
                  "AAA, 12,37.146,020000.0000",
                  "BBB, 13,38.314,001752.0000",
                  "CCC, 14,37.386,130000.0000",
                  "DDD, 15,37.443,100000.0000"
               };

   std::deque<trade> trade_list;
   strtk::truncated_int<unsigned int> tui;
   tui.fractional_size(4);

   for (std::size_t i = 0; i < sizeof(trade_data) / sizeof(std::string); ++i)
   {
      trade t;
      bool result = strtk::parse(trade_data[i],",",
                                 t.ticker,
                                 strtk::trim_leading(" ",t.id).ref(),
                                 t.price,
                                 tui(t.volume));
      if (result)
         trade_list.push_back(t);
      else
         std::cout << "Tuple parse error: " << trade_data[i] << std::endl;
   }
   return 0;
}

It should be noted that truncated_int can be used in conjunction with the various StrTk token processing primitives such as strtk::trim, strtk::trim_leading, strtk::as_lcase et al. The following example demonstrates parsing a tuple of values intended for int types, where the tokens have random space padding. The simple composition of strtk::truncated_int and strtk::trim allows for efficient and error free parsing of the tuple.

int main()
{
   std::string data = "   123.0   |    456.00    |  789.000";
   int i0,i1,i2;
   strtk::truncated_int<int> ti0;
   strtk::truncated_int<int> ti1;
   strtk::truncated_int<int> ti2;
   strtk::parse(data,"|",
                strtk::trim(" ",ti0(i0)).ref(),
                strtk::trim(" ",ti1(i1)).ref(),
                strtk::trim(" ",ti2(i2)).ref());
   printf("i0 = [%d] i1 = [%d] i2 = [%d]\n",i0,i1,i2);
   // i0 = [123] i1 = [456] i2 = [789]
   return 0;
}

Columnwise Parsing
In the previous section the ability to ignore tokens in a tuple was discussed. The concept works well if only a few tokens need to be ignored. However problems arise when the tuples contain a large number of tokens and the tokens that are to be ignored are numerous and distributed uniformly over the entire tuple. Situations such as this one are common, and using the ignore_token technique can not only make the coding of the solution cumbersome and error prone but also make the parsing process itself quite inefficient. A natural extension to ignore_token that scales and is also extremely efficient can be found in the combined functionalities of parse_columns and column_list. The column_list is a structure used to hold the indexes of the tokens in the tuple that are required. The indexes have to be valid, unique and in ascending order.
// even indexes [0,12]
auto cl_even = strtk::column_list(0,2,4,6,8,10,12);
// odd indexes [1,7]
auto cl_odd = strtk::column_list(1,3,5,7);

The strtk::parse_columns function takes a string of data representing a tuple, a delimiter to determine the tokens in the tuple, a strtk::column_list and a compatible number of types as target references. The number of types has to be equal to the number of indexes in the column_list, and the types need to be convertible within the strtk namespace from a string range representation to the type. In the following example we have a tuple consisting of integers. We're only interested in the first five even numbered indexes in the tuple; the code below demonstrates how the tuple is parsed with the given constraints:

int main()
{
   const std::string data = "1,22 333|4444 55555|666666 7777777,88888888 999 "
                            "0000|1111,22222,333333,4444444";
   int i0,i1,i2,i3,i4;
   strtk::parse_columns(data,
                        ", |",
                        strtk::column_list(0,2,4,6,8),
                        i0,i1,i2,i3,i4);
   return 0;
}

In the following example we have a tuple consisting of a mixture of types. We are only interested in the tokens at indexes 1, 5 and 8 in the tuple, which happen to be of type int, double and std::string respectively. The code below demonstrates how the tuple is parsed with the given constraints:

int main()
{
   const std::string data = "token0,1234 token2 token3,token4,11.22|token6 token7|"
                            "text-i-text,1 2 3 4 5 6 7";
   int i;
   double d;
   std::string s;
   strtk::parse_columns(data,
                        ", |",
                        strtk::column_list(1,5,8),
                        i,d,s);
   return 0;
}
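The reason a sorted list of column indexes scales well is that a single left to right scan can pick out the wanted tokens and stop as soon as the last one has been found. The sketch below shows that scan with standard library calls only; select_columns is a hypothetical function and returns the tokens as strings rather than converting them to target types.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies the tokens whose indexes are listed in
// 'columns' (which must be unique and ascending) into 'out'.
// Returns false if the tuple has fewer tokens than the list requires.
inline bool select_columns(const std::string& data,
                           const std::string& delimiters,
                           const std::vector<std::size_t>& columns,
                           std::vector<std::string>& out)
{
   out.clear();
   std::size_t index = 0; // index of the token currently being scanned
   std::size_t next  = 0; // position in 'columns' of the next wanted index
   std::size_t begin = 0; // offset of the current token within 'data'
   while (next < columns.size())
   {
      const std::size_t end = data.find_first_of(delimiters, begin);
      if (index == columns[next])
      {
         out.push_back(data.substr(begin, end - begin));
         ++next;
      }
      if (std::string::npos == end)
         break; // no more tokens
      begin = end + 1;
      ++index;
   }
   return next == columns.size();
}
```

Tokens that are not listed are never copied, and nothing after the last requested column is scanned.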

Simple 3D Mesh File Format Parser

Let's assume there is a file format for expressing a mesh. The format consists of a section that defines the unique vertexes in the mesh, and another section that defines the triangles in the mesh as a tuple of three indices indicating the vertexes used. Each section is preceded by an integer that denotes the number of subsequent elements in that section. An example of such a file is the following:

5
+1.0,+1.0,+1.0
-1.0,+1.0,-1.0
-1.0,-1.0,+1.0
+1.0,-1.0,-1.0
+0.0,+0.0,+0.0
4
0,1,4
1,2,4
2,3,4
3,1,4

Code to parse such a file format is as follows:

struct point
{
   double x,y,z;
};

struct triangle
{
   std::size_t i0,i1,i2;
};

int main()
{
   std::string mesh_file = "mesh.txt";
   std::ifstream stream(mesh_file.c_str());
   std::string s;

   // Process points section
   std::deque<point> points;
   point p;
   std::size_t point_count = 0;
   strtk::parse_line(stream," ",point_count);
   strtk::for_each_line_n(stream,
                          point_count,
                          [&points,&p](const std::string& line)
                          {
                             if (strtk::parse(line,",",p.x,p.y,p.z))
                                points.push_back(p);
                          });

   // Process triangles section
   std::deque<triangle> triangles;
   triangle t;
   std::size_t triangle_count = 0;
   strtk::parse_line(stream," ",triangle_count);
   strtk::for_each_line_n(stream,
                          triangle_count,
                          [&triangles,&t](const std::string& line)
                          {
                             if (strtk::parse(line,",",t.i0,t.i1,t.i2))
                                triangles.push_back(t);
                          });
   return 0;
}

Simple Semantic Actions

A semantic action is an action that is undertaken upon a token; it can be in the form of either an assessment or a manipulation. StrTk provides a very simplified semantic action capability, named strtk::util::semantic_action, for types that are being parsed via the strtk::parse function. A function (stateful or stateless) taking an iterator range is used to construct the semantic_action. The body of the function performs whatever operations are required and also makes sure to maintain the contract with regards to the return status for the parse routine to complete successfully. The following is an example where a comma delimited string is parsed into 3 types: an integer, a double and a string. The rules regarding parsing and updating of the variables are as follows: the int variable "i" will only be updated if the value parsed is odd, the double value "d" will only be updated if the parsed value is greater than or equal to 99.99, and the string value "s" will only be updated if the presented range contains the string "ring". Upon every successful update a corresponding counter will be incremented.

int main()
{
   std::string data = "12345,123.456,A string";
   int i = 0;
   double d = 0.0;
   std::string s = "";
   std::size_t i_update_count = 0;
   std::size_t d_update_count = 0;
   std::size_t s_update_count = 0;

   typedef strtk::range::ustring::const_iterator itr_type;
   using strtk::util::semantic_action;

   strtk::parse(data,",",
                // Token_0 (i) - Parse and update if value is odd
                semantic_action([&i,&i_update_count]
                                (itr_type begin,itr_type end) -> bool
                                {
                                   int temp = 0;
                                   if (strtk::string_to_type_converter(begin,end,temp))
                                   {
                                      if (temp % 2 != 0)
                                      {
                                         i = temp;
                                         ++i_update_count;
                                      }
                                      return true;
                                   }
                                   else
                                      return false;
                                }).ref(),
                // Token_1 (d) - Parse and update if value is greater than or equal to 99.99
                semantic_action([&d,&d_update_count]
                                (itr_type begin,itr_type end) -> bool
                                {
                                   double temp = 0.0;
                                   if (strtk::string_to_type_converter(begin,end,temp))
                                   {
                                      if (temp >= 99.99)
                                      {
                                         d = temp;
                                         ++d_update_count;
                                      }
                                      return true;
                                   }
                                   else
                                      return false;
                                }).ref(),
                // Token_2 (s) - Parse and update if value contains "ring"
                semantic_action([&s,&s_update_count]
                                (itr_type begin,itr_type end) -> bool
                                {
                                   static unsigned char pattern[] = "ring";
                                   if (end != std::search(begin,end,pattern,pattern + 4))
                                   {
                                      s.assign(begin,end);
                                      ++s_update_count;
                                   }
                                   return true;
                                }).ref());

   std::cout << "i: " << i << std::endl;
   std::cout << "d: " << d << std::endl;
   std::cout << "s: " << s << std::endl;
   return 0;
}

Pick A Random Line From A Text File


A random line of text is to be selected from a user-provided text file comprised of lines of varying length, such that the probability of any given line being selected is 1/N, where N is the number of lines in the text file. There are many trivial solutions to this problem. However, if one were to further constrain the problem by stating that the file is very large (TBs) and that the system upon which the solution is to run is very limited memory-wise, then most if not all trivial solutions, such as storing the offsets of all lines or reading the entire file into memory, are eliminated. There exists, however, a very elegant solution with O(n) time and O(1) space complexity. It involves scanning the entire file once, line by line, and at the ith line choosing whether or not to replace the final result line with the current line, by sampling a uniform distribution over the range [0,1) and performing the replacement if and only if the random value is less than 1/i.

The logic behind this solution is as follows: the ith line is selected with probability 1/i, so the probability that the result remains one of the previous i-1 lines is 1 - (1/i), or (i-1)/i. Because there are (i-1) lines before the ith line, each equally likely to be the current result, we divide that probability by (i-1), giving a selection probability of 1/i for every line up to and including the ith line. If the ith line is the last line of the text file, each line therefore has a selection probability of 1/N. Simple and sweet, and so too is the following implementation of said solution:
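#include <cstddef>
#include <ctime>
#include <iostream>
#include <string>
#include "strtk.hpp"

int main(int argc, char* argv[])
{
   std::string file_name = argv[1];
   std::string line;
   std::size_t i = 0;

   strtk::uniform_real_rng rng(static_cast<std::size_t>(::time(0)));

   // Replace the current result with the ith line with probability 1/i
   strtk::for_each_line(file_name,
                        [&](const std::string& s)
                        {
                           if (rng() < (1.0 / ++i))
                           {
                              line = s;
                           }
                        });

   std::cout << line << std::endl;

   return 0;
}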

What changes to the above code would be required if the probability of line selection was changed to 1/(N-K), where 0 <= K <= N and K is the number of lines that will be randomly ignored?

Note: TAOCP Volume II section 3.4.2 has an in-depth discussion of this problem, which is generally known as reservoir sampling, and of other similar problems relating to random distributions. One should also note that the above example has an inefficiency: on every replacement an actual string is copied. Normally all one needs to do is maintain a file offset to the beginning of the line. Not doing so causes slowdowns due to continuous memory allocations and reallocations, which is made all the worse when large lines are encountered.

The Buy Low And Sell High Problem


Assume we are given a piece of data in CSV format which represents a time series of the close prices of the SPDR S&P 500 ETF. The time series ranges from 04/01/1999 to 11/11/2011. The objective is to select two dates, the first being the buy date and the second being the sell date, such that if we were to buy and then sell X shares of the ETF we would have maximized our profit. It should be noted that the buy date must of course be before the sell date, and that short selling is not an option in this strategy. The following is a chart that represents the price of the SPDR over the given period:

Through visual inspection we can approximate that the best buy date would be towards the end of 2002 and the corresponding sell date would be towards the end of 2007, as these time points seem to give the largest difference between the two prices. However, it also seems that a buy roughly at the start of 2009 and a sell at the beginning of 2011 could provide a similarly large price difference. The visual inspection approach has thus led to an ambiguity, hence a more thorough and precise approach is required.

One could take the naive approach to solving the buy low sell high problem: initially load the entire time series into memory, then for each ith time point take price_i and test it against every other price_j where i < j, maintaining a "best profit encountered" structure that contains the best profit so far and the corresponding buy/sell prices and dates. This solution has a few problems, the first being that it is of O(N^2) time complexity and O(N) space complexity. As an example, one million time points will require on the order of one trillion comparisons and one million date/price units of storage. If memory and computational capacity are limited on the hardware, this solution cannot be practically executed in a continuous online manner. Furthermore, as suggested by the time complexity, as the size of the data grows the compute times would become astronomical regardless of computational abilities, making the results practically useless, especially for real-time reactive systems.

In these types of problems one tries to assess whether an online or streaming based solution is feasible. That is, a solution that does not require the data to be available all at once, can work on the data incrementally, and requires no more than one pass over each piece of data. Such a solution would typically have a time complexity of O(N) and a space complexity of O(1).

With regards to this problem, the crucial insight required to convert the naive solution from O(N^2) complexity to an online version of O(N) time complexity is that every new global minimum encountered is the beginning of a new period and an indicator of the end of the previous period. Looking at the chart, if one were to scan from left to right, the intuitive response is to find the point with the lowest price, ignore everything preceding it, then try to find the next point with the highest price, or global maximum. There are a few edge cases that need to be dealt with. The main one is the problem described above, where there are two buy/sell points that could potentially be the solution. The way around this is to simply maintain the best encountered period and compare the profit from any new period to the best so far; if it is better (more), the best is replaced with the current period. Another edge case is when the data is in continual decline; in a situation like this there will be no profitable buy/sell points. The following is a small subsection of the time series in question (spy500.csv):
15/06/2000,148.16
16/06/2000,146.59
19/06/2000,148.47
20/06/2000,147.94
21/06/2000,147.88
22/06/2000,145.63
23/06/2000,144.38
26/06/2000,146.23
27/06/2000,145.16
28/06/2000,145.56
29/06/2000,144.19

The code below is a very simple single-pass incremental solution to the given problem. It reads every line of the input file and parses each line into a date and a price variable. If the current price is less than the current global minimum price, it sets the start of the current period to the current date and sets the buy price to the current price. Otherwise, it checks whether the current price is larger than the period's maximum price, and if so it updates the current profit, sell price and sell date accordingly. If at the end the buy price is not less than the sell price, there exist no two time points within the given time series for which a profitable transaction could occur. Otherwise it prints out the buy and sell dates for the required transaction and the expected profit per share.
#include <iostream>
#include <limits>
#include <string>
#include "strtk.hpp"

struct period
{
   period()
   : profit(std::numeric_limits<double>::min()),
     buy_price(std::numeric_limits<double>::max()),
     sell_price(std::numeric_limits<double>::min())
   {}

   bool operator>(const period& p) const
   {
      return profit > p.profit;
   }

   double profit;
   double buy_price;
   double sell_price;
   std::string buy_date;
   std::string sell_date;
};

int main()
{
   period best_period;
   period curr_period;

   strtk::for_each_line("spy500.csv",
                        [&best_period,&curr_period](const std::string& line)
                        {
                           std::string date;
                           double price;

                           if (!strtk::parse(line,",",date,price))
                              return;

                           if (price < curr_period.buy_price)
                           {
                              // New global minimum: close the current period
                              if (curr_period > best_period)
                              {
                                 best_period = curr_period;
                              }

                              // Start a new period at the current price
                              curr_period.buy_date = date;
                              curr_period.buy_price = price;
                              curr_period.sell_price = std::numeric_limits<double>::min();
                              curr_period.sell_date = "";
                              // Reset the profit so that the previous period's
                              // profit is not carried into the new period
                              curr_period.profit = std::numeric_limits<double>::min();
                           }
                           else if (price > curr_period.sell_price)
                           {
                              curr_period.sell_price = price;
                              curr_period.sell_date = date;
                              curr_period.profit = curr_period.sell_price - curr_period.buy_price;
                           }
                        });

   // The final period is never closed by a new minimum, so test it here
   if (curr_period > best_period)
   {
      best_period = curr_period;
   }

   if (best_period.buy_price >= best_period.sell_price)
   {
      std::cout << "No period in time-series can be profitably exploited." << std::endl;
   }
   else
   {
      std::cout << "Buy: " << best_period.buy_date << std::endl;
      std::cout << "Sell: " << best_period.sell_date << std::endl;
      std::cout << "Profit per share: $" << best_period.profit << std::endl;
   }

   return 0;
}

For the given data, the above piece of code is expected to produce the following output:

Buy: 09/10/2002
Sell: 09/10/2007
Profit per share: $78.38

What changes to the above piece of code would be required if:

- Rather than prices, we are instead given percentage differences from the previous price?
- We have the constraint that we can't hold a position for more than K days?
- A penalty of 1/K% is applied to the profit for every day a position is held (where K is the condition and value described above)?
- Short selling is allowed?
- We want to process independent sections of the time series concurrently so as to speed up overall processing time?
- The prices could be extremely large or small?
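The single-pass idea behind the solution above can be isolated from the file handling. The following is a minimal sketch that works on prices already held in a vector; the names trade and best_trade are hypothetical and are not part of StrTk or of the article's code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: indices into the price series and the profit.
struct trade
{
   std::size_t buy_index;
   std::size_t sell_index;
   double profit;   // 0.0 when no profitable trade exists
};

// Single pass over the prices: remember the lowest price seen so far and
// test every later price against it. O(N) time, O(1) extra space.
trade best_trade(const std::vector<double>& prices)
{
   trade best = {0, 0, 0.0};
   std::size_t min_index = 0;

   for (std::size_t i = 1; i < prices.size(); ++i)
   {
      if (prices[i] < prices[min_index])
      {
         min_index = i;   // a new global minimum starts a new period
      }
      else if (prices[i] - prices[min_index] > best.profit)
      {
         best.buy_index = min_index;
         best.sell_index = i;
         best.profit = prices[i] - prices[min_index];
      }
   }

   return best;
}
```

For the three prices 148.16, 146.59 and 148.47, the sketch reports a buy at index 1 and a sell at index 2. For a series in continual decline the profit stays at 0.0, which corresponds to the "no profitable period" case.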

Token Grid
StrTk provides a means to easily parse and consume 2D grids of tokens in an efficient and simple manner. A grid is simply defined as a series of rows comprised of tokens, otherwise known as Delimiter Separated Values (DSV). The ith token of a row is grouped with the ith token of every other row to produce a column. Tokens can be processed as either rows or columns. An example of a simple token grid, where each numeric value is deemed to be a token:

1.1 1.1 1.1 1.1 1.1
2.2 2.2 2.2 2.2 2.2
3.3 3.3 3.3 3.3 3.3
4.4 4.4 4.4 4.4 4.4
5.5 5.5 5.5 5.5 5.5

A token grid can be passed in via either a file or a string buffer. The delimiters to be used for parsing the columns and rows can also be provided; if they are not provided, standard common defaults will be used. The following demonstrates how each cell in the grid can be accessed and cast to a specific type.
std::string data = "1,2,3,4,5,6\n"
                   "7,8,9,0,1,2\n"
                   "3,4,5,6,7,8\n"
                   "9,0,1,2,3,4\n"
                   "5,6,7,8,9,0\n";

strtk::token_grid grid(data,data.size(),",");

for (std::size_t r = 0; r < grid.row_count(); ++r)
{
   strtk::token_grid::row_type row = grid.row(r);
   for (std::size_t c = 0; c < row.size(); ++c)
   {
      std::cout << grid.get<int>(r,c) << '\t';
   }
   std::cout << std::endl;
}

The strtk::token_grid provides various helper functions for traversing rows and columns in batch mode. These are for_each_row, which iterates over either all or a sub-range of the rows of the token_grid, and for_each_column, which iterates over either all or a sub-range of the columns of a row.
int main()
{
                    //column 0   1   2   3   4   5   6
   std::string data = "1.1,2.1,3.1,4.1,5.1,6.1,7.1\n"  //row_0
                      "1.2,2.2,3.2,4.2,5.2,6.2,7.2\n"  //row_1
                      "1.3,2.3,3.3,4.3,5.3,6.3,7.3\n"  //row_2
                      "1.4,2.4,3.4,4.4,5.4,6.4,7.4\n"  //row_3
                      "1.5,2.5,3.5,4.5,5.5,6.5,7.5\n"  //row_4
                      "1.6,2.6,3.6,4.6,5.6,6.6,7.6\n"  //row_5
                      "1.7,2.7,3.7,4.7,5.7,6.7,7.7\n"  //row_6
                      "1.8,2.8,3.8,4.8,5.8,6.8,7.8\n"  //row_7
                      "1.9,2.9,3.9,4.9,5.9,6.9,7.9\n"; //row_8

   strtk::token_grid grid(data,data.size(),",");

   {
      // Process each row of the token grid
      grid.for_each_row(
         [](const strtk::token_grid::row_type& row)
         {
            for (std::size_t i = 0; i < row.size(); ++i)
            {
               std::cout << row.get<double>(i) << '\t';
            }
            std::cout << '\n';
         });
   }

   {
      // Process rows in the range [2,6] of the token grid
      grid.for_each_row(grid.range(2,6),
         [](const strtk::token_grid::row_type& row)
         {
            for (std::size_t i = 0; i < row.size(); ++i)
            {
               std::cout << row.get<double>(i) << '\t';
            }
            std::cout << '\n';
         });
   }

   {
      // Process each row and column of the token grid
      grid.for_each_row(
         [](const strtk::token_grid::row_type& row)
         {
            row.for_each_column(
               [](const strtk::token_grid::row_type::range_type& range)
               {
                  std::cout << std::string(range.first,range.second) << '\t';
               });
            std::cout << '\n';
         });
   }

   {
      // Process rows in the range [2,6] and the columns in the range [1,5] of the token grid
      grid.for_each_row(grid.range(2,6),
         [](const strtk::token_grid::row_type& row)
         {
            row.for_each_column(row.range(1,5),
               [](const strtk::token_grid::row_type::range_type& range)
               {
                  std::cout << std::string(range.first,range.second) << '\t';
               });
            std::cout << '\n';
         });
   }

   return 0;
}

The following example demonstrates how averages over rows and columns of a token grid can be computed:
std::string data = "1.1,1.1,1.1,1.1,1.1,1.1\n"
                   "2.2,2.2,2.2,2.2,2.2,2.2\n"
                   "3.3,3.3,3.3,3.3,3.3,3.3\n"
                   "4.4,4.4,4.4,4.4,4.4,4.4\n"
                   "5.5,5.5,5.5,5.5,5.5,5.5\n"
                   "6.6,6.6,6.6,6.6,6.6,6.6\n";

strtk::token_grid grid(data,data.size(),",");

std::vector<double> avg_c(grid.row(0).size(),0.0);
std::vector<double> avg_r(grid.row_count(),0.0);
std::vector<double> tmp(grid.row(0).size(),0.0);

std::fill(avg_c.begin(),avg_c.end(),0.0);

for (std::size_t i = 0; i < grid.row_count(); ++i)
{
   // Parse row i into tmp, add it to the column sums, then average the row
   grid.row(i).parse<double>(tmp.begin());
   std::transform(avg_c.begin(),avg_c.end(),tmp.begin(),avg_c.begin(),std::plus<double>());
   avg_r[i] = std::accumulate(tmp.begin(),tmp.end(),0.0) / tmp.size();
}

// Turn the column sums into column averages
for (std::size_t i = 0; i < avg_c.size(); avg_c[i++] /= grid.row_count());

std::cout << "Column Averages:\t";
std::copy(avg_c.begin(),avg_c.end(),std::ostream_iterator<double>(std::cout,"\t"));
std::cout << '\n';

std::cout << "Row Averages:\t";
std::copy(avg_r.begin(),avg_r.end(),std::ostream_iterator<double>(std::cout,"\t"));
std::cout << '\n';
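To make the row and column arithmetic explicit, the same averages can be sketched with the standard library alone. The names grid_type, parse_grid, row_averages and column_averages are hypothetical, and the sketch assumes well-formed numeric cells and rows of equal width.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::vector<double> > grid_type;

// Splits `data` into rows on '\n' and into numeric cells on `delimiter`.
grid_type parse_grid(const std::string& data, char delimiter)
{
   grid_type grid;
   std::istringstream rows(data);
   std::string row;
   while (std::getline(rows, row))
   {
      if (row.empty()) continue;
      std::vector<double> cells;
      std::istringstream columns(row);
      std::string cell;
      while (std::getline(columns, cell, delimiter))
         cells.push_back(std::stod(cell));
      grid.push_back(cells);
   }
   return grid;
}

// Mean of every row (assumes rows are non-empty).
std::vector<double> row_averages(const grid_type& grid)
{
   std::vector<double> avg;
   for (std::size_t r = 0; r < grid.size(); ++r)
   {
      double sum = 0.0;
      for (std::size_t c = 0; c < grid[r].size(); ++c)
         sum += grid[r][c];
      avg.push_back(sum / grid[r].size());
   }
   return avg;
}

// Mean of every column (assumes all rows have the same width).
std::vector<double> column_averages(const grid_type& grid)
{
   if (grid.empty()) return std::vector<double>();
   std::vector<double> avg(grid[0].size(), 0.0);
   for (std::size_t r = 0; r < grid.size(); ++r)
      for (std::size_t c = 0; c < avg.size(); ++c)
         avg[c] += grid[r][c];
   for (std::size_t c = 0; c < avg.size(); ++c)
      avg[c] /= grid.size();
   return avg;
}
```

Unlike the token grid, this sketch converts every cell up front and stores the converted values, which is simpler but uses more memory on large inputs.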

Processing Of Comma Separated Values Data


The original intent of the token grid was to support fast and efficient processing of simple data tuples, such as comma separated values (CSV) and similar formats. The following example demonstrates a simple summation of traded floor volume and average daily volume based on NASDAQ OHLC (Open High Low Close) data.
                          //Date,Symbol,Open,Close,High,Low,Volume
std::string market_data = "20090701,GOOG,424.2000,418.9900,426.4000,418.1500,2310768\n"
                          "20090701,MSFT,24.0500,24.0400,24.3000,23.9600,54915127\n"
                          "20090702,GOOG,415.4100,408.4900,415.4100,406.8100,2517630\n"
                          "20090702,MSFT,23.7600,23.3700,24.0400,23.2100,65427699\n"
                          "20090703,GOOG,408.4900,408.4900,408.4900,408.4900,0\n"
                          "20090703,MSFT,23.3700,23.3700,23.3700,23.3700,0\n"
                          "20090706,GOOG,406.5000,409.6100,410.6400,401.6600,2262557\n"
                          "20090706,MSFT,23.2100,23.2000,23.2800,22.8700,49207638\n"
                          "20090707,GOOG,408.2400,396.6300,409.1900,395.9801,3260307\n"
                          "20090707,MSFT,23.0800,22.5300,23.1400,22.4600,52842412\n"
                          "20090708,GOOG,400.0000,402.4900,406.0000,398.0600,3441854\n"
                          "20090708,MSFT,22.3100,22.5600,22.6900,2200000,73023306\n"
                          "20090709,GOOG,406.1200,410.3900,414.4500,405.8000,3275816\n"
                          "20090709,MSFT,22.6500,22.4400,22.8100,22.3700,46981174\n"
                          "20090710,GOOG,409.5700,414.4000,417.3700,408.7000,2929559\n"
                          "20090710,MSFT,22.1900,22.3900,22.5400,22.1500,43238698\n";

strtk::token_grid grid(market_data,market_data.size(),",");

struct stock_info
{
   stock_info(const std::string& s = " ")
   : symbol(s),
     total_volume(0),
     day_count(0),
     average_daily_volume(0.0)
   {}

   std::string symbol;
   unsigned long long total_volume;
   std::size_t day_count;
   double average_daily_volume;
};

stock_info goog("GOOG");
stock_info msft("MSFT");

static const std::size_t volume_column = 6;
static const std::size_t symbol_column = 1;

// Sum the volume column over rows whose symbol is GOOG; the return
// value is the number of rows that matched
goog.day_count = grid.accumulate_column(volume_column,
                    [](const strtk::token_grid::row_type& row) -> bool
                    {
                       static const std::string google_symbol("GOOG");
                       return google_symbol == row.get<std::string>(symbol_column);
                    },
                    goog.total_volume);

// Same again for MSFT
msft.day_count = grid.accumulate_column(volume_column,
                    [](const strtk::token_grid::row_type& row) -> bool
                    {
                       static const std::string microsoft_symbol("MSFT");
                       return microsoft_symbol == row.get<std::string>(symbol_column);
                    },
                    msft.total_volume);

goog.average_daily_volume = (1.0 * goog.total_volume) / goog.day_count;
msft.average_daily_volume = (1.0 * msft.total_volume) / msft.day_count;

std::cout << "[GOOG] Total Volume: " << goog.total_volume << std::endl;
std::cout << "[MSFT] Total Volume: " << msft.total_volume << std::endl;
std::cout << "[GOOG] ADV: " << goog.average_daily_volume << std::endl;
std::cout << "[MSFT] ADV: " << msft.average_daily_volume << std::endl;

The strtk::token_grid is thread safe provided only read operations are in play. As such, the above calls to accumulate_column and similar methods can all be safely and easily executed concurrently using threads. This allows for a far more efficient data processing methodology.

TIMTOWTDI
Playing devil's advocate, there is another way of performing the above processing task. Assuming only the specific values for computing the ADV are required, and no further processing of the CSV data is needed, the problem can be solved efficiently with a single pass over the data as follows:
std::string file_name = "market_data.csv";

std::unordered_map<std::string,stock_info> stock_map;

stock_map.insert(std::make_pair<std::string,stock_info>("GOOG",stock_info("GOOG")));
stock_map.insert(std::make_pair<std::string,stock_info>("MSFT",stock_info("MSFT")));

strtk::for_each_line(file_name,
                     [&](const std::string& line)
                     {
                        strtk::ignore_token ignore;
                        stock_info temp;

                        // Only the symbol and volume columns are parsed
                        const bool result = strtk::parse(line,
                                                         ",",
                                                         ignore,
                                                         temp.symbol,
                                                         ignore,
                                                         ignore,
                                                         ignore,
                                                         ignore,
                                                         temp.total_volume);
                        if (!result) return;

                        auto itr = stock_map.find(temp.symbol);
                        if (stock_map.end() == itr) return;

                        (*itr).second.total_volume += temp.total_volume;
                        (*itr).second.day_count++;
                     });

auto itr = stock_map.begin();
auto end = stock_map.end();

while (end != itr)
{
   stock_info& stock = (*itr++).second;
   stock.average_daily_volume = (1.0 * stock.total_volume) / stock.day_count;
}

TIMTOWTDI - II (With a vengeance)


Playing the devil's other advocate, the above two examples have both required that the filter condition be explicitly defined at compile time. Even though the condition may be set in stone at compile time, some of the underlyings (such as the symbol) can be engineered to be modified at run time. That still doesn't give us the freedom to evaluate arbitrarily complex filter expressions determined at run time. That said, an extremely efficient and very simple solution is at hand: the C++ DSV Filter library, which is based on the StrTk and ExprTk libraries. It uses the strtk::token_grid as a CSV/DSV store and index, and ExprTk as the underlying expression evaluation engine. The example below takes the OHLC market data table defined above and performs a row-wise query. The expression's definition is: select all rows where the open price is greater than the close price, the symbol matches the wild-card pattern "*FT*", and the date is equal to or after 20090101.
int main()
{
   std::string file_name = "market_data.csv";

   dsv_filter filter;

   filter.set_input_delimiter(",");

   if (!filter.load(file_name))
      return 1;

   std::string filter_expression = "(open > close) and (symbol like '*FT*') and (date >= '20090101')";

   filter.add_filter(filter_expression);

   // Row 0 holds the column definition header, so start at row 1
   for (std::size_t row = 1; row < filter.row_count(); ++row)
   {
      if (dsv_filter::e_match == filter[row])
      {
         // do something...
      }
   }

   return 0;
}

Other example queries:

volume >= 1000000 and symbol == 'GOOG'
abs(open - close) > abs(high - low)
avg(open,close,high,low) * volume > 10^7 and inrange('20090702',date,'20090730')
(open > close) and (symbol like '*FT*') and (date >= '20090101')

It should be noted that in the example above the rows begin at index 1. This is because the dsv_filter expects the first row, the row at index 0, to be a column definition header. The format of a column definition is the column name followed by a suffix of "_s" if the values in the column are to be treated as strings, or "_n" if they are to be treated as numbers. When defining expressions, the suffixes should not be included in the column names. The header of the above-mentioned OHLC CSV file would be as follows:

Date_n,Symbol_s,Open_n,Close_n,High_n,Low_n,Volume_n
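The suffix convention can be illustrated with a small standalone sketch. This is not how the DSV Filter library itself reads the header; column_def and parse_header are hypothetical names, and the sketch assumes a comma-delimited header line.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical description of one column taken from the header row.
struct column_def
{
   std::string name;   // column name without the suffix
   bool numeric;       // true for "_n", false for "_s"
};

// Parses a header such as "Date_n,Symbol_s". Returns false if any column
// lacks a recognised "_n" or "_s" suffix, or if the header is empty.
bool parse_header(const std::string& header, std::vector<column_def>& columns)
{
   columns.clear();
   std::istringstream in(header);
   std::string field;

   while (std::getline(in, field, ','))
   {
      if (field.size() < 3) return false;

      const std::string suffix = field.substr(field.size() - 2);
      if (suffix != "_n" && suffix != "_s") return false;

      column_def def;
      def.name = field.substr(0, field.size() - 2);
      def.numeric = (suffix == "_n");
      columns.push_back(def);
   }

   return !columns.empty();
}
```

Applied to the OHLC header, the sketch yields seven columns, of which only Symbol is treated as a string.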

C++ DSV Filter and Dependencies


DSV Filter http://www.partow.net/programming/dsvfilter/index.html
ExprTk http://www.partow.net/programming/exprtk/index.html

Sequential Partitions
A typical operation carried out upon time series data is to group tuples into buckets (or bins) based upon the time index value, for example grouping data into 2-minute buckets and then performing some kind of operation upon the grouped tuples, such as a summation or an average. This process is sometimes also called "discretization".

The strtk::token_grid class provides a method known as sequential_partition. The sequential_partition method requires a Transition Predicate, a Function and optionally a row-range. The Transition Predicate consumes a row and returns true only if the row is in the next partition. All subsequent consecutive rows, until the transition predicate next returns true, are said to be in the current partition. Prior to transitioning to a new partition, the function is invoked and provided with the range of rows in the current partition.

The following example takes a simple time series (time and value), partitions the tuples into groups of time-buckets of period length 3, and then proceeds to compute the total sum of each group. The summarizer class below provides both the Transition Predicate and the Function.

class summarizer
{
public:

   enum column_index
   {
      tick_time_column  = 0,
      tick_value_column = 1
   };

   summarizer(std::deque<double>& sum_value)
   : next_tick_time_(0),
     sum_value_(sum_value)
   {}

   // Transition Predicate
   inline bool operator()(const strtk::token_grid::row_type& row)
   {
      if (row.get<std::size_t>(tick_time_column) >= next_tick_time_)
      {
         next_tick_time_ = row.get<std::size_t>(tick_time_column) + 3;
         return true;
      }
      else
         return false;
   }

   // Function
   inline bool operator()(const strtk::token_grid& grid,
                          const strtk::token_grid::row_range_type& range)
   {
      double bucket_sum = 0.0;
      if (!grid.accumulate_column(tick_value_column,range,bucket_sum))
      {
         std::cout << "failed to accumulate!" << std::endl;
         return false;
      }
      else
         sum_value_.push_back(bucket_sum);
      return true;
   }

private:

   summarizer& operator=(const summarizer&);

   std::size_t next_tick_time_;
   std::deque<double>& sum_value_;
};

int main()
{
                      //time index, value
   std::string data = "10000,123.456\n"
                      "10001,612.345\n"
                      "10002,561.234\n"
                      "10003,456.123\n"
                      "10004,345.612\n"
                      "10005,234.561\n"
                      "10006,123.456\n";

   strtk::token_grid grid(data,data.size(),",");

   std::deque<double> sum_value;
   summarizer s(sum_value);

   grid.sequential_partition(s,s);

   for (std::size_t i = 0; i < sum_value.size(); ++i)
   {
      std::cout << "bucket[" << i << "] = " << sum_value[i] << std::endl;
   }

   return 0;
}

The expected output is as follows:

bucket[0] = 1297.035
bucket[1] = 1036.296
bucket[2] = 123.456
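The partitioning rule itself, without the token grid, can be sketched as follows. The names tick and bucket_sums are hypothetical, and the sketch assumes the ticks are already ordered by time.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Each tick is a (time, value) pair; ticks are assumed ordered by time.
typedef std::pair<std::size_t, double> tick;

// Sums the values of consecutive ticks that fall into buckets of `period`
// time units. A new bucket starts at the first tick whose time is at or
// past the end of the current bucket.
std::vector<double> bucket_sums(const std::vector<tick>& ticks, std::size_t period)
{
   std::vector<double> sums;
   std::size_t next_tick_time = 0;

   for (std::size_t i = 0; i < ticks.size(); ++i)
   {
      if (sums.empty() || ticks[i].first >= next_tick_time)
      {
         // Transition: open a new bucket anchored at this tick's time.
         next_tick_time = ticks[i].first + period;
         sums.push_back(0.0);
      }
      sums.back() += ticks[i].second;
   }

   return sums;
}
```

With a period of 3 and the seven ticks 10000 to 10006 used in the example, the sketch produces three buckets holding the first three values, the next three values and the final value respectively.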

Parsing CSV Files With Embedded Double-Quotes


One of the simple extensions to the CSV format is the concept of double-quoted tokens. Such tokens may contain column or row delimiters. When an opening double quote is encountered, all subsequent delimiters are ignored and kept as part of the token until the corresponding closing double quote is encountered. The StrTk token_grid supports the parsing of such tokens. This parsing mode can be easily activated via the token_grid option set. Below is an example of a token_grid loading a CSV data set representing various airports from around the world, together with their specific codes and locations, in which some of the cells are double quoted:

ICAO   IATA   Airport                            City                    Country
AYGA   GKA    "Goroka Gatue"                     Goroka                  Papua New Guinea
BGCO   GCO    "Nerlerit Inaat Constable Pynt"    "Nerlerit Inaat"        Greenland
BZGD   ZGD    Godley                             Auckland                New Zealand
CYQM   YQM    "Greater Moncton International"    Moncton                 Canada
EDRK   ZNV    "Koblenz Winningen"                Koblenz                 Germany
FAHU   AHU    "HMS Bastard Memorial"             Kwazulu Natal           South Africa
FQMP   MZB    "Mocimboa Da Praia"                "Mocimboa Da Praia"     Mozambique
KINS   INS    "Indian Springs AF AUX"            Indian Springs          USA
UHNN   HNN    Nikolaevsk                         "Nikolaevsk Na Amure"   Russia
WBKK   BKI    "Kota Kinabalu International"      Kota Kinabalu           Malaysia
ZSJD   JDZ    "Jingdezhen Airport"               Jingdezhen              China

The following is the StrTk code example using token_grid to parse the above CSV data set:
int main()
{
   std::string airport_data_file_name = "airport_data.csv";

   strtk::token_grid::options options;
   options.column_delimiters = "| ,";
   options.support_dquotes = true;

   strtk::token_grid airport_grid(airport_data_file_name,options);

   // for each row r, for each column c, print cell[r,c]
   for (std::size_t r = 0; r < airport_grid.row_count(); ++r)
   {
      strtk::token_grid::row_type row = airport_grid.row(r);
      for (std::size_t c = 0; c < row.size(); ++c)
      {
         std::cout << "[" << row.get<std::string>(c) << "] ";
      }
      std::cout << std::endl;
   }

   return 0;
}
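The quoting rule can be shown in isolation with a few lines of standard C++. This is only a sketch of the idea, not the token_grid implementation: split_quoted is a hypothetical name, the sketch handles a single line, and it drops the quote characters from the output, which may differ from what StrTk returns.

```cpp
#include <string>
#include <vector>

// Splits one line on `delimiter`, treating text between double quotes as
// part of a single token. The quote characters themselves are not kept.
std::vector<std::string> split_quoted(const std::string& line, char delimiter)
{
   std::vector<std::string> tokens;
   std::string current;
   bool in_quotes = false;

   for (std::string::size_type i = 0; i < line.size(); ++i)
   {
      const char c = line[i];
      if (c == '"')
      {
         in_quotes = !in_quotes;        // enter or leave a quoted run
      }
      else if (c == delimiter && !in_quotes)
      {
         tokens.push_back(current);     // delimiter outside quotes ends a token
         current.clear();
      }
      else
      {
         current += c;                  // data, including quoted delimiters
      }
   }

   tokens.push_back(current);
   return tokens;
}
```

For the illustrative line a,"b,c",d the sketch yields the three tokens a, b,c and d; the comma inside the quotes is kept as data.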

Extending Delimiter Predicates


As previously mentioned, the concept of a delimiter-based predicate can lead to some very interesting solutions. A predicate as defined so far, with the exception of the offset predicate, has been a stateless entity. Adding the ability to maintain state, based on what the predicate has encountered so far, allows it to behave differently from the simple single and multiple delimiter predicates. For this example, let's assume a typical command line parsing problem which involves double quotation mark groupings and escapable special characters, which can be considered dual use as either delimiters or data. An example input and output is as follows:

Inputs

Data         abc;"123, mno xyz",i\,jk
Delimiters   <space>;,.

Output Tokens

Token0   abc
Token1   123, mno xyz
Token2   i\,jk

In order to tokenize the above described string, one can create a composite predicate using a multiple char delimiter predicate and some simple state rules. The following is an example of such an extended predicate:
class extended_predicate
{
public:

   extended_predicate(const std::string& delimiters)
   : escape_(false),
     in_bracket_range_(false),
     mdp_(delimiters)
   {}

   inline bool operator()(const unsigned char c) const
   {
      if (escape_)
      {
         // The character following a backslash is always data
         escape_ = false;
         return false;
      }
      else if ('\\' == c)
      {
         escape_ = true;
         return false;
      }
      else if ('"' == c)
      {
         // A double quote toggles the grouping and acts as a delimiter
         in_bracket_range_ = !in_bracket_range_;
         return true;
      }
      else if (in_bracket_range_)
         return false;
      else
         return mdp_(c);
   }

   inline void reset()
   {
      escape_ = false;
      in_bracket_range_ = false;
   }

private:

   mutable bool escape_;
   mutable bool in_bracket_range_;
   mutable strtk::multiple_char_delimiter_predicate mdp_;
};

Usage of the newly defined extended predicate is as follows:


int main()
{
   std::string str = "abc;\"123, mno xyz\",i\\,jk";
   strtk::std_string::token_list_type token_list;
   strtk::split(extended_predicate(".,; "),
                str,
                std::back_inserter(token_list),
                strtk::split_options::compress_delimiters);
   return 0;
}
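The state rules of the predicate can be traced with a standalone tokenizer that needs no StrTk types. The function name tokenize is hypothetical, and dropping every empty token is used here as a simple stand-in for delimiter compression; StrTk's compress_delimiters option may treat edge cases differently.

```cpp
#include <string>
#include <vector>

// Tokenizes `input` using the same three rules as the extended predicate:
// a backslash makes the next character data, a double quote toggles a
// grouping (and is itself a delimiter), and inside a grouping nothing else
// is a delimiter. Empty tokens are dropped.
std::vector<std::string> tokenize(const std::string& input,
                                  const std::string& delimiters)
{
   std::vector<std::string> tokens;
   std::string current;
   bool escape = false;
   bool in_quotes = false;

   for (std::string::size_type i = 0; i < input.size(); ++i)
   {
      const char c = input[i];
      bool is_delimiter = false;

      if (escape)
         escape = false;                 // escaped character is data
      else if (c == '\\')
         escape = true;                  // the backslash itself is kept as data
      else if (c == '"')
      {
         in_quotes = !in_quotes;
         is_delimiter = true;
      }
      else if (!in_quotes)
         is_delimiter = (delimiters.find(c) != std::string::npos);

      if (!is_delimiter)
         current += c;
      else if (!current.empty())
      {
         tokens.push_back(current);
         current.clear();
      }
   }

   if (!current.empty())
      tokens.push_back(current);

   return tokens;
}
```

Applied to the input abc;"123, mno xyz",i\,jk with the delimiters space, semicolon, comma and full stop, the sketch produces the tokens abc, then 123, mno xyz, then i\,jk.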

High Performance Key-Value Parsing


Take our previous person struct as an example. It is clear that the tuple format has to be very specific with regards to the ordering of data. For most situations this is acceptable, as the serializers and deserializers attempt to function in the simplest manner possible. Sometimes, however, pieces of information may come in different orderings, or may be deemed optional and hence not be present in a particular tuple of data. An approach is required that provides the means of mapping specific pieces of data to the corresponding variables (or members) that will store or make sense of them, in a very efficient and simple manner. For clarity, the term "mapping" here means populating the desired member with the given data, the association between the two being made by the key that is paired in line with the data. Key-value pairs, sometimes known as attribute-value pairs, are one of the means by which such an association can be accomplished. The following is a diagram that demonstrates the mapping of various fields of a data struct to their corresponding data elements in a tuple comprised of key-value pairs.

In the diagram above, the key-value pairs are separated (delimited) by the pipe symbol "|". With regards to the key-value pairs themselves, the key is traditionally the first element in the pair and the value is the second element. They are separated in this case by a single equal sign "=".

Back to our original problem relating to the person struct. If we were to add keys to each of the pieces of data, we could then not only parse a tuple of data representing the various fields of the person struct, but those fields could be in any order within the tuple. Furthermore, any of the fields could be deemed optional and hence not necessarily be present in the tuple. It should be noted that what denotes a correct or successful parse of a tuple may depend not only upon the successful parsing of ranges into the various types, but also upon the mandatory presence of certain fields. Our objective will be to parse, successfully and in the most efficient way possible, the following list of tuples that represent instances of our person struct.
Tuple0 = "ID=0xFA37ED12|NAME=Rumpelstiltskin|AGE=397|HEIGHT=1.31|WEIGHT=58.7"
Tuple1 = "NAME=Rumpelstiltskin|AGE=397|HEIGHT=1.31|WEIGHT=58.7|ID=0xFA37ED12"
Tuple2 = "AGE=397|HEIGHT=1.31|WEIGHT=58.7|ID=0xFA37ED12|NAME=Rumpelstiltskin"
Tuple3 = "HEIGHT=1.31|WEIGHT=58.7|ID=0xFA37ED12|NAME=Rumpelstiltskin|AGE=397"
Tuple4 = "WEIGHT=58.7|ID=0xFA37ED12|NAME=Rumpelstiltskin|AGE=397|HEIGHT=1.31"
Tuple5 = "ID=0xFA37ED12|WEIGHT=58.7|AGE=397|NAME=Rumpelstiltskin|HEIGHT=1.31"
Tuple6 = "WEIGHT=58.7|AGE=397|NAME=Rumpelstiltskin|HEIGHT=1.31|ID=0xFA37ED12"
Tuple7 = "AGE=397|NAME=Rumpelstiltskin|HEIGHT=1.31|ID=0xFA37ED12|WEIGHT=58.7"
Tuple8 = "NAME=Rumpelstiltskin|HEIGHT=1.31|ID=0xFA37ED12|WEIGHT=58.7|AGE=397"
Tuple9 = "HEIGHT=1.31|ID=0xFA37ED12|WEIGHT=58.7|AGE=397|NAME=Rumpelstiltskin"

StrTk provides a means to achieve the above key-value pair parsing task, namely via the strtk::keyvalue::parser and associated key mappers. The following table lists the various kinds of key to value mappers that are available in the StrTk library:

Key To Value Mappers

Mapper                           Key Type         Key Lookup Complexity   Maximum Size
strtk::keyvalue::stringkey_map   std::string      O(log(n))               Limited to available memory
strtk::keyvalue::uintkey_map     cardinal value   O(1)                    Limited to expected maximum key value

In the code below we begin by defining the delimiters we expect to see between the key-value pairs and between the key and the value within each pair. The pair_block_delimiter field denotes the delimiter we expect between pairs (or blocks) of key-values, and the pair_delimiter field denotes the delimiter we expect between a particular key and its value.

Next we define p as an instance of the person struct and register each of the members of the instance p, together with its corresponding key, with the keyvalue::parser. We then process each tuple of data: parsing the tuple, populating the instance p, then printing out the various fields to stdout.
struct person
{
   unsigned int id;    // key = ID
   std::string name;   // key = NAME
   unsigned int age;   // key = AGE
   double height;      // key = HEIGHT
   float weight;       // key = WEIGHT
};

int main()
{
   const std::string person_data[] =
      {
         "ID=0xFA37ED12|NAME=Rumpelstiltskin|AGE=397|HEIGHT=1.31|WEIGHT=58.7",
         "NAME=Rumpelstiltskin|AGE=397|HEIGHT=1.31|WEIGHT=58.7|ID=0xFA37ED12",
         "AGE=397|HEIGHT=1.31|WEIGHT=58.7|ID=0xFA37ED12|NAME=Rumpelstiltskin",
         "HEIGHT=1.31|WEIGHT=58.7|ID=0xFA37ED12|NAME=Rumpelstiltskin|AGE=397",
         "WEIGHT=58.7|ID=0xFA37ED12|NAME=Rumpelstiltskin|AGE=397|HEIGHT=1.31",
         "ID=0xFA37ED12|WEIGHT=58.7|AGE=397|NAME=Rumpelstiltskin|HEIGHT=1.31",
         "WEIGHT=58.7|AGE=397|NAME=Rumpelstiltskin|HEIGHT=1.31|ID=0xFA37ED12",
         "AGE=397|NAME=Rumpelstiltskin|HEIGHT=1.31|ID=0xFA37ED12|WEIGHT=58.7",
         "NAME=Rumpelstiltskin|HEIGHT=1.31|ID=0xFA37ED12|WEIGHT=58.7|AGE=397",
         "HEIGHT=1.31|ID=0xFA37ED12|WEIGHT=58.7|AGE=397|NAME=Rumpelstiltskin",
      };

   const std::size_t person_data_size = sizeof(person_data) / sizeof(std::string);

   //Basic type definition
   typedef unsigned char char_type;
   typedef strtk::keyvalue::parser<strtk::keyvalue::stringkey_map> kvp_type;
   typedef strtk::keyvalue::options<char_type> opts_type;

   //Setup the various delimiters
   opts_type options;
   options.pair_block_delimiter = '|';
   options.pair_delimiter       = '=';

   kvp_type kvp(options);

   person p;

   strtk::hex_to_number_sink<unsigned int> h2ns(p.id);

   //Define the mapping between the key and the
   //members of the person struct
   kvp.register_keyvalue("ID"    , h2ns);
   kvp.register_keyvalue("NAME"  , p.name);
   kvp.register_keyvalue("AGE"   , p.age);
   kvp.register_keyvalue("HEIGHT", p.height);
   kvp.register_keyvalue("WEIGHT", p.weight);

   //Go through each tuple
   for (std::size_t i = 0; i < person_data_size; ++i)
   {
      //If parsing of the ith tuple is successful
      //print the person struct out.
      if (kvp(person_data[i]))
      {
         std::cout << i << '\t'
                   << "ID: "     << p.id     << '\t'
                   << "Name: "   << p.name   << '\t'
                   << "AGE: "    << p.age    << '\t'
                   << "HEIGHT: " << p.height << '\t'
                   << "WEIGHT: " << p.weight << '\n';
      }
   }

   return 0;
}
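The idea of registering a key against a member and then dispatching on the key can be sketched with the standard library alone. This is an illustration of the concept rather than a description of how strtk::keyvalue::parser works internally: parse_person is a hypothetical name, std::map gives the O(log(n)) lookup that the string-keyed mapper is described as having, and the numeric conversions perform no validation.

```cpp
#include <cstdlib>
#include <functional>
#include <map>
#include <sstream>
#include <string>

struct person
{
   unsigned int id;
   std::string name;
   unsigned int age;
   double height;
   float weight;
};

// Parses "KEY=VALUE|KEY=VALUE|..." into `p`. Keys may appear in any order
// and may be absent; an unknown key or a pair without '=' fails the parse.
bool parse_person(const std::string& tuple, person& p)
{
   typedef std::function<void(const std::string&)> setter;

   // Key to member mapping: each key owns a small conversion routine.
   std::map<std::string, setter> setters;
   setters["ID"]     = [&p](const std::string& v)
                       { p.id = static_cast<unsigned int>(std::strtoul(v.c_str(), nullptr, 16)); };
   setters["NAME"]   = [&p](const std::string& v) { p.name = v; };
   setters["AGE"]    = [&p](const std::string& v)
                       { p.age = static_cast<unsigned int>(std::strtoul(v.c_str(), nullptr, 10)); };
   setters["HEIGHT"] = [&p](const std::string& v) { p.height = std::strtod(v.c_str(), nullptr); };
   setters["WEIGHT"] = [&p](const std::string& v)
                       { p.weight = static_cast<float>(std::strtod(v.c_str(), nullptr)); };

   std::istringstream pairs(tuple);
   std::string pair;

   while (std::getline(pairs, pair, '|'))
   {
      const std::string::size_type eq = pair.find('=');
      if (eq == std::string::npos) return false;

      std::map<std::string, setter>::const_iterator it = setters.find(pair.substr(0, eq));
      if (it == setters.end()) return false;

      it->second(pair.substr(eq + 1));
   }

   return true;
}
```

Because the lookup is by key, any ordering of the five pairs populates the struct identically. Rebuilding the map on every call keeps the sketch short; a real parser would build the mapping once and reuse it, as the registration step above does.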

It should be noted that the underlying associative container for the strtk::keyvalue::stringkey_map can be explicitly specified at compile time. By default it is std::map, however it can easily be changed to std::unordered_map, giving it an average key lookup complexity of O(1), or in fact to any other container that is compatible with STL associative map semantics. The following are three examples of key-value parsers specialized using std::unordered_map, the first based on the std::string type, the second on the int type and the third on the double type:
int main() ' ** std::string type key to value mapper based on std::unordered_map ' **Iasic type deCinition typedeC unsigned c7ar c7ar_type typedeC std::unordered_map<std::string,strtk::util::value> key_to_val_map_t typedeC strtk::keyvalue::key_map<std::string,key_to_val_map_t> string7as7key_mapper typedeC strtk::keyvalue::parser<string7as7key_mapper> kvp_type typedeC strtk::keyvalue::options<c7ar_type> opts_type **Xetup t7e various delimiters opts_type options options$pair_block_delimiter " '|' options$pair_delimiter " '"' kvp_type kvp(options) ( ** int type key to value mapper based on std::unordered_map ' **Iasic type deCinition

typedeC unsigned c7ar c7ar_type typedeC std::unordered_map<int,strtk::util::value> key_to_val_map_t typedeC strtk::keyvalue::key_map<int,key_to_val_map_t> int7as7_key_mapper typedeC strtk::keyvalue::parser<int7as7_key_mapper> kvp_type typedeC strtk::keyvalue::options<c7ar_type> opts_type **Xetup t7e various delimiters opts_type options options$pair_block_delimiter " '|' options$pair_delimiter " '"' kvp_type kvp(options) ( ** double type key to value mapper based on std::unordered_map ' **Iasic type deCinition typedeC unsigned c7ar c7ar_type typedeC std::unordered_map<double,strtk::util::value> key_to_val_map_t typedeC strtk::keyvalue::key_map<double,key_to_val_map_t> double7as7_key_mapper typedeC strtk::keyvalue::parser<double7as7_key_mapper> kvp_type typedeC strtk::keyvalue::options<c7ar_type> opts_type **Xetup t7e various delimiters opts_type options options$pair_block_delimiter " '|' options$pair_delimiter " '"' ( ( kvp_type kvp(options)

return :

Key-Value Pairs And Multiple Distinct Data Structures

To further the previous example, one might have a situation where a tuple contains values for different members of different types. The following example demonstrates parsing of key value pairs that map to members of multiple types. The tuple in the example consists of values some of which are intended for the instance of data1 and others which are intended for the instance of data2.

struct data1
{
   int x0;
   std::string x1;
   double x2;
   float x3;
};

struct data2
{
   int y0;
   std::string y1;
   double y2;
   float y3;
};

int main()
{
   typedef unsigned char char_type;
   typedef strtk::keyvalue::parser<strtk::keyvalue::stringkey_map> kvp_type;
   typedef strtk::keyvalue::options<char_type> opts_type;

   opts_type options;
   options.pair_block_delimiter = '|';
   options.pair_delimiter = '=';

   kvp_type kvp(options);

   data1 d1;
   data2 d2;

   std::string data = "D1X0=123456789|D1X1=AbCdEfG123456|D1X2=123.456|D1X3=0101.0202f|"
                      "D2Y0=987654321|D2Y1=tUvWxYz789012|D2Y2=789.012|D2Y3=0707.0909f";

   //Register values for data1 type
   kvp.register_keyvalue("D1X0", d1.x0);
   kvp.register_keyvalue("D1X1", d1.x1);
   kvp.register_keyvalue("D1X2", d1.x2);
   kvp.register_keyvalue("D1X3", d1.x3);

   //Register values for data2 type
   kvp.register_keyvalue("D2Y0", d2.y0);
   kvp.register_keyvalue("D2Y1", d2.y1);
   kvp.register_keyvalue("D2Y2", d2.y2);
   kvp.register_keyvalue("D2Y3", d2.y3);

   //Parse the tuple
   kvp(data);

   std::cout << "D1X0: " << d1.x0 << '\t'
             << "D1X1: " << d1.x1 << '\t'
             << "D1X2: " << d1.x2 << '\t'
             << "D1X3: " << d1.x3 << '\n';

   std::cout << "D2Y0: " << d2.y0 << '\t'
             << "D2Y1: " << d2.y1 << '\t'
             << "D2Y2: " << d2.y2 << '\t'
             << "D2Y3: " << d2.y3 << '\n';

   return 0;
}
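To make the idea of one parser feeding several unrelated objects more concrete, the following is a minimal standard-library-only sketch. It is not how StrTk implements its parser: the class kv_parser_sketch, the structs and the helper parse_into are hypothetical names introduced purely for illustration, and the conversion relies on operator>> instead of StrTk's converters.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for a key-value parser; it is not the StrTk implementation.
class kv_parser_sketch
{
public:
   kv_parser_sketch(char block_delimiter, char pair_delimiter)
   : block_delimiter_(block_delimiter),
     pair_delimiter_(pair_delimiter)
   {}

   // Bind a key to any destination that can be read with operator>>.
   template <typename T>
   void register_keyvalue(const std::string& key, T& target)
   {
      setters_[key] = [&target](const std::string& text) -> bool
                      {
                         std::istringstream stream(text);
                         // The whole value must be consumed by the conversion.
                         return (stream >> target) && stream.eof();
                      };
   }

   // Strings take the whole value verbatim, spaces included.
   void register_keyvalue(const std::string& key, std::string& target)
   {
      setters_[key] = [&target](const std::string& text) -> bool
                      {
                         target = text;
                         return true;
                      };
   }

   // Returns false on a malformed pair, an unknown key or a failed conversion.
   bool operator()(const std::string& tuple) const
   {
      std::istringstream blocks(tuple);
      std::string pair;
      while (std::getline(blocks, pair, block_delimiter_))
      {
         const std::size_t pos = pair.find(pair_delimiter_);
         if (std::string::npos == pos) return false;
         const auto itr = setters_.find(pair.substr(0, pos));
         if (setters_.end() == itr) return false;
         if (!itr->second(pair.substr(pos + 1))) return false;
      }
      return true;
   }

private:
   char block_delimiter_;
   char pair_delimiter_;
   std::map<std::string, std::function<bool(const std::string&)> > setters_;
};

struct data1_sketch { int x0; std::string x1; double x2; };
struct data2_sketch { int y0; std::string y1; double y2; };

// One parser instance, two destination objects.
inline bool parse_into(const std::string& tuple, data1_sketch& d1, data2_sketch& d2)
{
   kv_parser_sketch kvp('|', '=');
   kvp.register_keyvalue("D1X0", d1.x0);
   kvp.register_keyvalue("D1X1", d1.x1);
   kvp.register_keyvalue("D1X2", d1.x2);
   kvp.register_keyvalue("D2Y0", d2.y0);
   kvp.register_keyvalue("D2Y1", d2.y1);
   kvp.register_keyvalue("D2Y2", d2.y2);
   return kvp(tuple);
}
```

Because each registration only captures a reference to its destination, the parser itself does not need to know which struct a member belongs to. That is the property the example above relies on.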

Key-Value Pairs And Lists

The values in the key value pairs need not always be singular. In some scenarios, a value could be a list of values. The StrTk key value parser supports parsing of such key value pairs through the previously demonstrated sequence sink mechanisms. In the following example we have a complex_data struct that consists of some PODs but also a number of sequences (vector, deque, list). The tuple to be parsed consists of simple key value pairs for the PODs and more complex looking pairs where the values for the specific sequence of values are separated by commas ','.

struct complex_data
{
   unsigned int           v0;
   std::vector<int>       v1;
   double                 v2;
   std::deque<double>     v3;
   std::string            v4;
   std::list<std::string> v5;
};

int main()
{
   typedef unsigned char char_type;
   typedef strtk::keyvalue::parser<strtk::keyvalue::stringkey_map> kvp_type;
   typedef strtk::keyvalue::options<char_type> opts_type;

   opts_type options;
   options.pair_block_delimiter = '|';
   options.pair_delimiter = '=';

   kvp_type kvp(options);

   complex_data cd;

   std::string data = "V0=123456789|V1=-3,-2,-1,0,+1,+2,+3|V2=111.222|V3=1.1,2.2,3.3,4.4,5.5|"
                      "V4=Some Text|V5=Text1,Text2,Text3,Text4";

   strtk::vector_sink<int>::type vec_sink(",");
   strtk::deque_sink<double>::type deq_sink(",");
   strtk::list_sink<std::string>::type lst_sink(",");

   //Register values for the complex_data type
   kvp.register_keyvalue("V0", cd.v0);
   kvp.register_keyvalue("V1", vec_sink(cd.v1));
   kvp.register_keyvalue("V2", cd.v2);
   kvp.register_keyvalue("V3", deq_sink(cd.v3));
   kvp.register_keyvalue("V4", cd.v4);
   kvp.register_keyvalue("V5", lst_sink(cd.v5));

   kvp(data);

   std::cout << "V0: " << cd.v0 << '\n'
             << "V1: " << strtk::join(" ",cd.v1) << '\n'
             << "V2: " << cd.v2 << '\n'
             << "V3: " << strtk::join(" ",cd.v3) << '\n'
             << "V4: " << cd.v4 << '\n'
             << "V5: " << strtk::join(" ",cd.v5) << '\n';

   return 0;
}

Key-Value Pairs With Cardinal Keys

So far all the examples have assumed keys of arbitrary values. In some situations such as the FIX protocol, the keys are always guaranteed to be positive integer values. If this is the case then a different kind of key mapper can be used that is much more efficient than the general purpose string key mapper, providing a key lookup complexity of O(1). As such StrTk provides the strtk::keyvalue::uintkey_map type for this purpose. The only difference in terms of setting up the uintkey_map from the stringkey_map is that it requires a key_count to be set. This value bounds the key values that can exist: in the example below the keys lie in the range [0,126], so key_count is set to 127. The following is an example of how the strtk::keyvalue::uintkey_map key mapper can be used.

struct data_store
{
   char            c; // key = 121
   unsigned char  uc; // key = 122
   short           s; // key = 123
   unsigned short us; // key = 124
   int             i; // key = 125
   std::string   str; // key = 126
};

int main()
{
   const std::string data = "121=A|122=z|123=-123|124=456|125=-12345678|126=Some simple text";

   typedef strtk::keyvalue::parser<strtk::keyvalue::uintkey_map> kvp_type;

   strtk::keyvalue::uintkey_map::options options;
   options.key_count = 127; //[0,126] --> 127 keys
   options.pair_block_delimiter = '|';
   options.pair_delimiter = '=';

   kvp_type kvp(options);

   data_store ds;

   kvp.register_keyvalue(121,ds.  c);
   kvp.register_keyvalue(122,ds. uc);
   kvp.register_keyvalue(123,ds.  s);
   kvp.register_keyvalue(124,ds. us);
   kvp.register_keyvalue(125,ds.  i);
   kvp.register_keyvalue(126,ds.str);

   kvp(data);

   return 0;
}
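The O(1) lookup claimed for cardinal keys follows from using the key directly as an array index. The sketch below illustrates that idea with the standard library only. It is an assumption about the general technique, not the actual uintkey_map implementation, and uint_key_table_sketch, register_key and dispatch are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch of a direct-indexed key table; it is not the StrTk uintkey_map.
class uint_key_table_sketch
{
public:
   typedef std::function<bool(const std::string&)> setter_t;

   // key_count is one more than the largest key, so valid keys are [0,key_count).
   explicit uint_key_table_sketch(std::size_t key_count)
   : setters_(key_count)
   {}

   // Registration fails for keys outside the table.
   bool register_key(std::size_t key, setter_t setter)
   {
      if (key >= setters_.size()) return false;
      setters_[key] = setter;
      return true;
   }

   // Lookup is a bounds check plus one array index: no hashing, no string comparison.
   bool dispatch(std::size_t key, const std::string& value) const
   {
      if (key >= setters_.size()) return false;
      if (!setters_[key]) return false; // key in range but never registered
      return setters_[key](value);
   }

private:
   std::vector<setter_t> setters_;
};
```

The trade-off is memory: the table holds one slot per possible key whether or not it is registered, which is why an upper bound such as key_count has to be supplied up front.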

Optional Key-Value Pairs

As previously mentioned there is a use case where certain fields may be deemed optional and hence their absence would not constitute a parsing error. That said, it would also be beneficial to know if a particular field has been populated or not once the sequence of key value pairs has been completely parsed and mapped. StrTk provides such functionality through the use of the strtk::util::attribute type.

The strtk::util::attribute acts as a proxy for the underlying type which it is specialized upon. It provides a conversion cast to the underlying type, and also maintains an 'initialised' state value that can be used to query the attribute about the underlying type's initialised status. The following is an example of how one could use the strtk::util::attribute type in conjunction with the keyvalue::parser:

struct data_store
{
   strtk::util::attribute<int>          d1;
   strtk::util::attribute<unsigned int> d2;
   strtk::util::attribute<double>       d3;
   strtk::util::attribute<float>        d4;
   strtk::util::attribute<std::string>  d5;
};

int main()
{
   std::string data = "INT=-1234|UINT=+5678|DOUBLE=1234.5678|"
                      "FLOAT=90123.4567f|STRING=Some simple text";

   typedef unsigned char char_type;
   typedef strtk::keyvalue::parser<strtk::keyvalue::stringkey_map> kvp_type;
   typedef strtk::keyvalue::options<char_type> opts_type;

   opts_type options;
   options.pair_block_delimiter = '|';
   options.pair_delimiter = '=';

   kvp_type kvp(options);

   data_store ds;

   // At this point:
   // d1.initialised() = false
   // d2.initialised() = false
   // d3.initialised() = false
   // d4.initialised() = false
   // d5.initialised() = false

   kvp.register_keyvalue("INT"   , ds.d1);
   kvp.register_keyvalue("UINT"  , ds.d2);
   kvp.register_keyvalue("DOUBLE", ds.d3);
   kvp.register_keyvalue("FLOAT" , ds.d4);
   kvp.register_keyvalue("STRING", ds.d5);

   if (!kvp(data))
   {
      std::cout << "Failed to parse key-value data: " << data << std::endl;
      return 1;
   }

   if (!ds.d1.initialised())
   {
      std::cout << "d1 has not been initialised." << std::endl;
   }

   if (!ds.d2.initialised())
   {
      std::cout << "d2 has not been initialised." << std::endl;
   }

   if (!ds.d3.initialised())
   {
      std::cout << "d3 has not been initialised." << std::endl;
   }

   if (!ds.d4.initialised())
   {
      std::cout << "d4 has not been initialised." << std::endl;
   }

   if (!ds.d5.initialised())
   {
      std::cout << "d5 has not been initialised." << std::endl;
   }

   return 0;
}
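The proxy-with-a-flag idea behind optional fields can be sketched in a few lines. The following is only an illustration of the concept under assumed names (attribute_sketch and record_sketch are hypothetical). The real strtk::util::attribute may differ in interface and behaviour.

```cpp
#include <string>

// Hypothetical sketch of an 'optional field' proxy; it is not strtk::util::attribute.
template <typename T>
class attribute_sketch
{
public:
   // A freshly constructed attribute holds a default value and is not initialised.
   attribute_sketch() : value_(), initialised_(false) {}

   // Assignment stores the value and flips the state to initialised.
   attribute_sketch& operator=(const T& value)
   {
      value_ = value;
      initialised_ = true;
      return *this;
   }

   // Conversion back to the underlying type.
   operator const T&() const { return value_; }

   // Query whether a value has ever been assigned.
   bool initialised() const { return initialised_; }

private:
   T value_;
   bool initialised_;
};

struct record_sketch
{
   attribute_sketch<int>         id;
   attribute_sketch<std::string> nickname;
};
```

After a parse, a caller can inspect initialised() on each member to distinguish "field absent" from "field present with a default-looking value", which a plain int or std::string member cannot express.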

Semantic Actions With Key-Value Pairs

There may be times when, as key value pairs are being parsed, certain actions need to be executed or behaviours exhibited in situ with the parsing process. As previously mentioned, StrTk provides the type semantic_action that can act as a proxy for a generic type during the parsing process that also takes a functor or lambda and executes it at the conversion call. The following example demonstrates the parsing of an array of tuples comprised of key value pairs that map to members of a struct namely data_store. The keys 111, 222 and 333 each represent a specific value type, and they also require a certain behavior to be exhibited. In this example, for simplicity, as the values of the various keys are being parsed, a simple message will be printed to the console denoting the nature of the parsing process. The code is as follows:

struct data_store
{
   data_store()
   : i1(0),
     d1(0.0),
     s1("")
   {}

   int         i1; // key = 111
   double      d1; // key = 222
   std::string s1; // key = 333
};

int main()
{
   static const std::string data[] =
      {
         "111=-12345|222=1987.654321|333=An interesting string0", //Tuple0
         "222=2987.654321|333=An interesting string1|111=+12345", //Tuple1
         "333=An interesting string2|111=-12345|222=3987.654321", //Tuple2
         "222=4987.654321|111=+12345|333=An interesting string3", //Tuple3
         "333=An interesting string4|222=5987.654321|111=-12345", //Tuple4
         "111=+12345|333=An interesting string5|222=6987.654321", //Tuple5
      };

   static const std::size_t data_size = sizeof(data) / sizeof(std::string);

   typedef strtk::keyvalue::parser<strtk::keyvalue::uintkey_map> kvp_type;

   strtk::keyvalue::uintkey_map::options options;
   options.key_count = 334; //[111,333]
   options.pair_block_delimiter = '|';
   options.pair_delimiter = '=';

   kvp_type kvp(options);

   data_store ds;

   typedef strtk::range::ustring::const_iterator itr_type;
   using strtk::util::semantic_action;

   //Parsing action for key 111
   auto lambda_key111 = [&ds](itr_type begin,itr_type end) -> bool
                        {
                           if (!strtk::string_to_type_converter(begin,end,ds.i1))
                           {
                              std::cout << "Failed to parse value for key=111\n";
                              return false;
                           }
                           std::cout << "key[111] value=" << ds.i1 << std::endl;
                           return true;
                        };

   //Parsing action for key 222
   auto lambda_key222 = [&ds](itr_type begin,itr_type end) -> bool
                        {
                           if (!strtk::string_to_type_converter(begin,end,ds.d1))
                           {
                              std::cout << "Failed to parse value for key=222\n";
                              return false;
                           }
                           std::cout << "key[222] value=" << ds.d1 << std::endl;
                           return true;
                        };

   //Parsing action for key 333
   auto lambda_key333 = [&ds](itr_type begin,itr_type end) -> bool
                        {
                           if (!strtk::string_to_type_converter(begin,end,ds.s1))
                           {
                              std::cout << "Failed to parse value for key=333\n";
                              return false;
                           }
                           std::cout << "key[333] value=" << ds.s1 << std::endl;
                           return true;
                        };

   kvp.register_keyvalue(111,semantic_action(lambda_key111).ref());
   kvp.register_keyvalue(222,semantic_action(lambda_key222).ref());
   kvp.register_keyvalue(333,semantic_action(lambda_key333).ref());

   for (std::size_t i = 0; i < data_size; ++i)
   {
      if (!kvp(data[i]))
         std::cout << "Error while parsing tuple: " << i << std::endl;
      std::cout << std::endl;
   }

   return 0;
}

Note: Semantic actions during the parsing process have a multitude of uses, some of which are: validation of parsed values (that is, making sure that they're in a specified range or within a predefined set of values), complex parsed value manipulations, and invoking of external state machines to transition to new states based on the parsed value or even simply the presence of a key.
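As an illustration of the validation use case, the following sketch builds an action that only commits a parsed integer when it falls inside a given range. It uses the standard library only. make_range_checked_action is a hypothetical helper, and the way such an action would be attached to a StrTk parser is not shown here.

```cpp
#include <functional>
#include <sstream>
#include <string>

// Hypothetical helper: builds an action that converts a value and accepts it
// only when it lies in the closed interval [low,high].
inline std::function<bool(const std::string&)>
make_range_checked_action(int& target, int low, int high)
{
   return [&target, low, high](const std::string& text) -> bool
          {
             std::istringstream stream(text);
             int candidate = 0;
             // Reject values that are not entirely a valid integer.
             if (!(stream >> candidate) || !stream.eof()) return false;
             // Reject values outside of the permitted range.
             if ((candidate < low) || (candidate > high)) return false;
             target = candidate; // commit only after validation succeeds
             return true;
          };
}
```

A rejected value leaves the destination untouched, so a failed action can be reported as a parse error without leaving a half-updated object behind.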

The Letters Game

In the popular TV game show Countdown (aka Letters and Numbers), contestants in the Letters round take turns choosing letters from either a vowel or consonant bin. Typically up to 9 letters are chosen, after which the contestants are given a certain amount of time (usually 30 seconds) to find the longest 'valid' English word made up of only the letters that had been chosen. The contestant with the longest valid word wins the round. What defines a valid word is usually specific to the version of the show. Some examples are using the OED or the Macquarie dictionary coupled together with rules related to proper nouns, plurals and combination words. The Letters round is essentially an anagram solving challenge.

The Letters game can be generally defined as: Given a canonical set of substrings of varying length called C, generated from the alphabet A, and a subset of not necessarily unique elements derived from A called N, find the longest substring in the set C that is also in the set of the 2^|N| unique combinations generated from the set N.

A typical process for solving the Letters problem is as follows:

Step 0: A list of unique valid words is specified (eg: OED or Word List)
   o Step 0.0: The longest word length from the list of words is noted as being L
   o Step 0.1: Each word is normalised to a common case. (eg: all lower case)
   o Step 0.2: Store the length of the words into a 'word length' set
Step 1: For every word w_i, generate the key k_i by sorting the letters of the word. (eg: For the word 'english', the key 'eghilns' is derived.)
Step 2: Insert each key/word pair into an associative map M (eg: hash-table)
Step 3: A list of potentially non unique letters of length N is specified comprised of both consonants and vowels
   o Step 3.0: Lexicographically order this list of letters
Step 4: For every N choose I combinations over the list of N letters, where I starts from min(L,N) and tends to 1:
   o Step 4.0: If no words of length I exist proceed to the next value of I
   o Step 4.1: Test the current combination x as a key in M
   o Step 4.2: If x exists in M, add all the words associated with x into a solution list S
   o Step 4.3: Otherwise continue on with the next combinations and values of I
   o Step 4.4: Once all the combinations for the current I have been enumerated present the solution list S (if it is not empty)

int main()
{
   std::string word_list_file_name = "word_list.txt";

   typedef std::unordered_multimap<std::string,std::string> word_map_t;
   typedef std::pair<word_map_t::iterator,word_map_t::iterator> word_map_range_t;

   word_map_t word_map;
   std::set<unsigned int> word_length;
   std::size_t longest_word = 0;

   //Load the word map as per steps 0 and 1
   strtk::for_each_line(word_list_file_name,
                        [&](const std::string& line)
                        {
                           if (line.empty()) return;
                           //Work on a copy, as the line is passed by const reference
                           std::string word = line;
                           strtk::remove_leading_trailing(" \t\n\r",word);
                           //Generate the key for the specified word.
                           strtk::convert_to_lowercase(word);
                           std::string key = word;
                           strtk::sort(key);
                           word_map.insert(std::make_pair(key,word));
                           longest_word = std::max(longest_word,word.size());
                           word_length.insert(word.size());
                        });

   std::string letters;

   for ( ; ; )
   {
      //Load, prepare and validate the game letters
      std::cout << "Enter letters: ";
      if (!std::getline(std::cin,letters)) break;

      static const std::string illegal_chars = strtk::ext_string::all_chars() -
                                               strtk::ext_string::all_letters();

      strtk::multiple_char_delimiter_predicate pred(illegal_chars);
      strtk::remove_inplace(pred,letters);

      if (letters.empty()) break;

      strtk::convert_to_lowercase(letters);
      strtk::sort(letters);

      const std::size_t upper_bound = std::min(longest_word,letters.size());

      std::unordered_set<std::string> solution_list;

      for (std::size_t i = upper_bound; i > 0; --i)
      {
         if (word_length.end() == word_length.find(i)) continue;

         typedef std::string::iterator str_itr_t;

         //Enumerate all N-choose-I combinations as per step 4
         strtk::for_each_combination(letters.begin(),letters.end(),
                                     i,
                                     [&](str_itr_t begin, str_itr_t end)
                                     {
                                        std::string key(begin,end);
                                        word_map_range_t itr_range = word_map.equal_range(key);
                                        if (0 == strtk::distance(itr_range)) return;
                                        auto itr = itr_range.first;
                                        while (itr_range.second != itr)
                                        {
                                           solution_list.insert(itr->second);
                                           ++itr;
                                        }
                                     });

         //Present the solution list if solutions have been found as per step 4.4
         if (!solution_list.empty())
         {
            std::copy(solution_list.begin(),
                      solution_list.end(),
                      std::ostream_iterator<std::string>(std::cout,"\n"));
            break;
         }
      }
   }

   return 0;
}

Notes On The Letters Round Solution

The time complexity of the given solution is O(2^min(L,N)), which is quite large. What makes the solution practical is the fact that natural languages such as English tend to have short common words derived from relatively small alphabets, with an upper range length of about 10-12 characters (excluding names et al and of course Pneumonoultramicroscopicsilicovolcanoconiosis). So for example an N of 10 or even 20 (assuming L is adequately large) will only amount to a total of 1024 and 1048576 combinations respectively, and both search spaces can be trivially enumerated in a brute force manner in a mere fraction of a millisecond using modern hardware. However the problem space becomes daunting at around N of 64 and larger, at which point a constant multiplier can be applied by distributing the enumeration process and performing said computations concurrently. Note this will not reduce the overall complexity of the solution, just the time it will take to complete. Furthermore, today this technique may only be practical for values of N less than 67. One final note: the above process will not only provide the first solution it encounters, it will return all possible solutions for the largest encountered length combination.
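The solving process and its 2^N search space can also be sketched with the standard library alone. The version below is a simplification and not the StrTk-based solution: instead of enumerating N-choose-I combinations from the longest length downwards, it walks every non-empty subset of the letters once with a bitmask and keeps the longest matches. The function name longest_words is hypothetical, the word list is assumed to be lower case already, and the bitmask limits it to fewer than 32 letters.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Sketch of the solving process using only the standard library. Subsets of the
// letters are enumerated with a bitmask, so this assumes fewer than 32 letters.
inline std::set<std::string> longest_words(const std::vector<std::string>& word_list,
                                           std::string letters)
{
   // Map each sorted-letter key to the words that share it.
   std::unordered_multimap<std::string,std::string> word_map;
   for (const std::string& word : word_list)
   {
      std::string key = word;
      std::sort(key.begin(), key.end());
      word_map.insert(std::make_pair(key, word));
   }

   // Order the letters so that every subset taken in index order is a sorted key.
   std::sort(letters.begin(), letters.end());

   std::set<std::string> best;
   std::size_t best_length = 0;

   // Visit all 2^N - 1 non-empty subsets and keep the longest matches.
   const unsigned long subset_count = 1UL << letters.size();
   for (unsigned long mask = 1; mask < subset_count; ++mask)
   {
      std::string key;
      for (std::size_t i = 0; i < letters.size(); ++i)
      {
         if (mask & (1UL << i)) key += letters[i];
      }

      // Shorter candidates cannot improve on what has been found.
      if (key.size() < best_length) continue;

      const auto range = word_map.equal_range(key);
      if (range.first == range.second) continue;

      if (key.size() > best_length)
      {
         best_length = key.size();
         best.clear();
      }

      for (auto itr = range.first; itr != range.second; ++itr)
      {
         best.insert(itr->second);
      }
   }

   return best;
}
```

For a word list containing "english" and "shingle", calling longest_words with the letters "hsilgne" returns both words, since they share the key "eghilns". The loop bound of 2^N makes the exponential growth discussed above directly visible.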

Past Letters Round Games

The following is a short list of the Letters round games played on Countdown during the 2010 season.
TXSTKH!TK TT_XHJH_! IQX]HJHO_ !^TUHJTT_ SXKHYSJNV TTLPHH_SX WT_Y!LKJS _USJKLTJL X^NJHKVO! ITKJSVKKT SHNJMHTJS RX^T!KJKO NHXJSHQ!N UTJ!_TKTS NNILH!XLT ^HVLWTK_J ^KYHTJV[J _TTT!KJHQ RNLH_TN!Q XPTX!HK_L WQJOXKHSJ M\P[TKHJV XV_T!H!L^ HRPJKTT_J W^T_HK!TH T^Q_!HL]J TTHLNSHKR X_TTHL!SJ TTHLNSHKR X_TTHL!SJ TTLPHH_SX WT_Y!LKJS XMSH!LWIH _MP\QJ!JH ^N]^H!KSO TTNHKHXT! [PTTLJHKX NXSXHJHRK TTVJHJXT! _RT_K!JHX

Performance Comparisons

The following are tables of results generated by running the strtk_tokenizer_cmp test. Currently it covers simple comparisons between Boost String Algorithms, Boost lexical_cast, The Standard Library, Spirit (Karma, Qi) and StrTk in the following areas:

   Tokenization
   Splitting
   Integer To String
   String To Integer
   String To Double

Scenario 0 - MSVC 2010 (64-bit, O2, Ot, GL and PGO)

Source   Test               Size      Time(sec)    Rate                      % from Baseline
Boost    Tokenizer          24000000  8.5857sec    2795359.4074tks/sec       100.0%, 100.0%
StrTk    Tokenizer          24000000  3.5019sec    6853393.1186tks/sec       40.7%, 245.1%
Boost    Split              9600000   5.5414sec    1732414.5137tks/sec       100.0%, 100.0%
StrTk    Split              9600000   0.8218sec    11681814.9167tks/sec      14.8%, 674.3%
sprintf  Integer To String  80000000  35.8128sec   2233840.0564nums/sec      100.0%, 100.0%
Boost    Integer To String  80000000  19.3994sec   4123832.0477nums/sec      54.1%, 184.6%
Karma    Integer To String  80000000  6.2528sec    12794349.6524nums/sec     17.4%, 572.7%
StrTk    Integer To String  80000000  1.5664sec    51071439.9822nums/sec     4.3%, 2286.2%
atoi     String To Integer  88500000  5.1802sec    17084370.4936nums/sec     100.0%, 100.0%
Boost    String To Integer  88500000  119.6261sec  739805.3712nums/sec       2309.2%, 4.3%
Qi       String To Integer  88500000  2.1951sec    40317238.6629nums/sec     42.3%, 235.9%
StrTk    String To Integer  88500000  1.8181sec    48677773.5466nums/sec     35.0%, 284.9%
atof     String To Double   30650000  15.2306sec   2012396.7122nums/sec      100.0%, 100.0%
Boost    String To Double   30650000  52.9244sec   579127.8866nums/sec       347.4%, 28.7%
Qi       String To Double   30650000  2.8665sec    10692313.5853nums/sec     18.8%, 531.3%
StrTk    String To Double   30650000  1.6069sec    19073975.7679nums/sec     10.5%, 947.8%

Scenario 1 - MSVC 2010 (O2, Ot, GL and PGO)

Source   Test               Size      Time(sec)    Rate                      % from Baseline
Boost    Tokenizer          24000000  9.4715sec    2533910.4769tks/sec       100.0%, 100.0%
StrTk    Tokenizer          24000000  2.8889sec    8307786.9292tks/sec       30.5%, 327.8%
Boost    Split              9600000   7.2291sec    1327965.9706tks/sec       100.0%, 100.0%
StrTk    Split              9600000   1.1301sec    8494610.9664tks/sec       15.6%, 639.6%
sprintf  Integer To String  80000000  38.2576sec   2091088.8038nums/sec      100.0%, 100.0%
Boost    Integer To String  80000000  28.9931sec   2759277.4769nums/sec      75.7%, 131.9%
Karma    Integer To String  80000000  4.9173sec    16269254.0190nums/sec     12.8%, 778.0%
StrTk    Integer To String  80000000  1.8270sec    43786838.0279nums/sec     4.7%, 2093.9%
atoi     String To Integer  88500000  6.0076sec    14731435.8942nums/sec     100.0%, 100.0%
Boost    String To Integer  88500000  185.4955sec  477100.6474nums/sec       3087.0%, 3.2%
Qi       String To Integer  88500000  2.5060sec    35314785.8370nums/sec     41.7%, 239.7%
StrTk    String To Integer  88500000  2.2095sec    40054213.0736nums/sec     36.7%, 271.8%
atof     String To Double   30650000  17.6435sec   1737179.9302nums/sec      100.0%, 100.0%
Boost    String To Double   30650000  78.6528sec   389687.3997nums/sec       445.7%, 22.4%
Qi       String To Double   30650000  3.8034sec    8058494.1994nums/sec      21.5%, 463.8%
StrTk    String To Double   30650000  2.0450sec    14987780.2310nums/sec     11.5%, 862.7%

Scenario 2 - MSVC 2008 SP1 (O2, Ot, GL and PGO)

Source   Test               Size      Time(sec)    Rate                      % from Baseline
Boost    Tokenizer          24000000  9.6533sec    2486184.8282tks/sec       100.0%, 100.0%
StrTk    Tokenizer          24000000  3.4748sec    6906943.9529tks/sec       35.9%, 277.8%
Boost    Split              9600000   10.2600sec   935674.7490tks/sec        100.0%, 100.0%
StrTk    Split              9600000   1.3793sec    6959830.0652tks/sec       13.4%, 743.8%
sprintf  Integer To String  80000000  24.6427sec   3246397.8287nums/sec      100.0%, 100.0%
Boost    Integer To String  80000000  27.5865sec   2899968.5753nums/sec      111.9%, 89.3%
Karma    Integer To String  80000000  5.4864sec    14581630.6963nums/sec     22.2%, 449.1%
StrTk    Integer To String  80000000  2.4224sec    33025441.1256nums/sec     9.8%, 1017.2%
atoi     String To Integer  88500000  5.9297sec    14924814.8683nums/sec     100.0%, 100.0%
Boost    String To Integer  88500000  186.1372sec  475455.6660nums/sec       3139.0%, 3.1%
Qi       String To Integer  88500000  2.0874sec    42397446.1804nums/sec     35.2%, 284.0%
StrTk    String To Integer  88500000  2.0485sec    43202160.1371nums/sec     34.5%, 289.4%
atof     String To Double   30650000  18.0458sec   1698455.0767nums/sec      100.0%, 100.0%
Boost    String To Double   30650000  77.4527sec   395725.4111nums/sec       429.2%, 23.2%
Qi       String To Double   30650000  3.9631sec    7733881.1294nums/sec      21.9%, 455.3%
StrTk    String To Double   30650000  2.0723sec    14790236.0804nums/sec     11.4%, 870.8%

Scenario 3 - Intel C++ v11.1.060 IA-32 (O2, Ot, Qipo, QxHost and PGO)

Source   Test               Size      Time(sec)    Rate                      % from Baseline
Boost    Tokenizer          24000000  10.0096sec   2397697.7836tks/sec       100.0%, 100.0%
StrTk    Tokenizer          24000000  3.1837sec    7538416.8541tks/sec       31.8%, 314.4%
Boost    Split              9600000   9.5450sec    1005760.0310tks/sec       100.0%, 100.0%
StrTk    Split              9600000   1.4292sec    6716893.1359tks/sec       14.9%, 667.8%
sprintf  Integer To String  80000000  23.8979sec   3347577.5824nums/sec      100.0%, 100.0%
Boost    Integer To String  80000000  27.5618sec   2902565.2045nums/sec      115.3%, 86.7%
Karma    Integer To String  80000000  4.6600sec    17167208.7654nums/sec     19.4%, 512.8%
StrTk    Integer To String  80000000  2.8450sec    28119857.2736nums/sec     11.9%, 840.0%
atoi     String To Integer  88500000  5.9386sec    14902610.8922nums/sec     100.0%, 100.0%
Boost    String To Integer  88500000  180.5856sec  490072.4001nums/sec       3040.8%, 3.2%
Qi       String To Integer  88500000  2.5273sec    35017073.8639nums/sec     42.5%, 234.9%
StrTk    String To Integer  88500000  1.8718sec    47281492.1287nums/sec     31.5%, 317.2%
atof     String To Double   30650000  18.4357sec   1662538.0810nums/sec      100.0%, 100.0%
Boost    String To Double   30650000  78.1543sec   392172.9598nums/sec       423.9%, 23.5%
Qi       String To Double   30650000  2.8321sec    10822353.0510nums/sec     15.3%, 650.9%
StrTk    String To Double   30650000  2.2930sec    13366541.5515nums/sec     12.4%, 803.9%

Scenario 4 - GCC 4.5 (O3, PGO)

Source   Test               Size      Time(sec)    Rate                      % from Baseline
Boost    Tokenizer          24000000  9.2510sec    2594305.4347tks/sec       100.0%, 100.0%
StrTk    Tokenizer          24000000  3.9717sec    6042688.5734tks/sec       42.9%, 232.9%
Boost    Split              9600000   5.0640sec    1895728.2331tks/sec       100.0%, 100.0%
StrTk    Split              9600000   1.5411sec    6229231.8384tks/sec       30.4%, 328.5%
sprintf  Integer To String  80000000  14.7807sec   5412477.0993nums/sec      100.0%, 100.0%
Boost    Integer To String  80000000  19.1131sec   4185620.7707nums/sec      129.3%, 77.3%
Karma    Integer To String  80000000  6.4455sec    12411808.2841nums/sec     43.6%, 229.3%
StrTk    Integer To String  80000000  4.5174sec    17709364.5349nums/sec     30.5%, 327.1%
atoi     String To Integer  88500000  5.2139sec    16973721.6103nums/sec     100.0%, 100.0%
Boost    String To Integer  88500000  50.5326sec   1751344.8498nums/sec      969.1%, 10.3%
Qi       String To Integer  88500000  1.9694sec    44937612.8835nums/sec     37.7%, 264.7%
StrTk    String To Integer  88500000  1.9008sec    46558706.5833nums/sec     36.4%, 274.2%
atof     String To Double   30650000  6.6975sec    4576328.3036nums/sec      100.0%, 100.0%
Boost    String To Double   30650000  29.6375sec   1034162.2422nums/sec      442.5%, 22.5%
Qi       String To Double   30650000  2.9852sec    10267435.7138nums/sec     44.5%, 224.3%
StrTk    String To Double   30650000  1.5961sec    19202937.1409nums/sec     23.8%, 419.6%

Scenario 5 - GCC 4.5 (O3, PGO) Intel Atom N450

Source   Test               Size      Time(sec)    Rate                      % from Baseline
Boost    Tokenizer          24000000  29.1370sec   823695.4389tks/sec        100.0%, 100.0%
StrTk    Tokenizer          24000000  12.3607sec   1941644.0499tks/sec       42.4%, 235.7%
Boost    Split              9600000   16.5261sec   580899.9726tks/sec        100.0%, 100.0%
StrTk    Split              9600000   4.9102sec    1955110.2611tks/sec       29.7%, 336.5%
sprintf  Integer To String  80000000  50.3456sec   1589015.6118nums/sec      100.0%, 100.0%
Boost    Integer To String  80000000  91.1475sec   877698.1401nums/sec       181.0%, 55.2%
Karma    Integer To String  80000000  21.8904sec   3654568.8712nums/sec      43.4%, 229.9%
StrTk    Integer To String  80000000  12.1877sec   6564009.9274nums/sec      24.2%, 413.0%
atoi     String To Integer  88500000  17.6615sec   5010896.5768nums/sec      100.0%, 100.0%
Boost    String To Integer  88500000  191.9446sec  461070.5357nums/sec       1086.7%, 9.2%
Qi       String To Integer  88500000  6.2808sec    14090561.7119nums/sec     35.5%, 281.1%
StrTk    String To Integer  88500000  6.1552sec    14378086.8208nums/sec     34.8%, 286.9%
atof     String To Double   30650000  21.4865sec   1426474.1027nums/sec      100.0%, 100.0%
Boost    String To Double   30650000  139.8166sec  219215.7409nums/sec       650.7%, 15.3%
Qi       String To Double   30650000  11.3916sec   2690567.9223nums/sec      53.0%, 188.6%
StrTk    String To Double   30650000  6.4396sec    4759608.7027nums/sec      29.9%, 333.6%

Scenario 6 - GCC 4.5 (O3, PGO) Intel Xeon E5540

Source   Test               Size      Time(sec)    Rate                      % from Baseline
Boost    Tokenizer          24000000  7.5657sec    3172216.8787tks/sec       100.0%, 100.0%
StrTk    Tokenizer          24000000  2.7379sec    8765832.8290tks/sec       36.1%, 276.3%
Boost    Split              9600000   3.0706sec    3126386.1126tks/sec       100.0%, 100.0%
StrTk    Split              9600000   1.1279sec    8511136.2899tks/sec       36.7%, 272.2%
sprintf  Integer To String  80000000  10.9012sec   7338642.9638nums/sec      100.0%, 100.0%
Boost    Integer To String  80000000  12.3317sec   6487328.7872nums/sec      113.1%, 88.3%
Karma    Integer To String  80000000  3.7202sec    21504260.6660nums/sec     34.1%, 293.0%
StrTk    Integer To String  80000000  2.5183sec    31768042.4612nums/sec     23.1%, 432.8%
atoi     String To Integer  88500000  4.0087sec    22077037.6357nums/sec     100.0%, 100.0%
Boost    String To Integer  88500000  30.3659sec   2914454.4393nums/sec      757.4%, 13.2%
Qi       String To Integer  88500000  1.7976sec    49231871.5454nums/sec     43.3%, 223.0%
StrTk    String To Integer  88500000  1.7384sec    50908881.7303nums/sec     43.3%, 230.5%
atof     String To Double   30650000  5.2118sec    5880843.9328nums/sec      100.0%, 100.0%
Boost    String To Double   30650000  21.5546sec   1421966.9538nums/sec      413.5%, 24.1%
Qi       String To Double   30650000  3.2149sec    9533840.3118nums/sec      61.6%, 162.1%
StrTk    String To Double   30650000  1.3929sec    22003661.2944nums/sec     26.7%, 374.1%

Scenario 7 - GCC 4.5 (O3, PGO) Intel Xeon X5650

Source   Test               Size      Time(sec)    Rate                      % from Baseline
Boost    Tokenizer          24000000  4.1944sec    5721901.2924tks/sec       100.00%, 100.00%
StrTk    Tokenizer          24000000  2.5087sec    9566860.3956tks/sec       59.81%, 167.19%
Boost    Split              9600000   2.9104sec    3298520.2014tks/sec       100.00%, 100.00%
StrTk    Split              9600000   1.1105sec    8644949.2334tks/sec       38.15%, 262.08%
sprintf  Integer To String  80000000  9.7148sec    8234840.3537nums/sec      100.00%, 100.00%
Boost    Integer To String  80000000  11.5726sec   6912860.7120nums/sec      119.12%, 83.94%
Karma    Integer To String  80000000  4.0832sec    19592620.4395nums/sec     42.03%, 237.92%
StrTk    Integer To String  80000000  2.2204sec    36029641.5861nums/sec     22.85%, 437.52%
atoi     String To Integer  88500000  3.1028sec    28522836.1561nums/sec     100.00%, 100.00%
Boost    String To Integer  88500000  6.1270sec    14444145.2249nums/sec     197.46%, 50.64%
Qi       String To Integer  88500000  1.5313sec    57794031.2153nums/sec     49.35%, 202.62%
StrTk    String To Integer  88500000  1.4409sec    61417814.6362nums/sec     46.43%, 215.32%
atof     String To Double   42880000  6.1342sec    6990292.6549nums/sec      100.00%, 100.00%
Boost    String To Double   42880000  28.9461sec   1481374.9232nums/sec      772.37%, 21.19%
Qi       String To Double   42880000  3.6549sec    11732137.3557nums/sec     59.58%, 167.83%
StrTk    String To Double   42880000  1.3860sec    30937683.0792nums/sec     22.59%, 442.58%

Note 1: The tests are compiled with specific optimisation flags to produce the best possible results for the respective compilers and architectures. Furthermore the tests are run natively (no virtualizations were used) on an almost completely idle machine so as to reduce interference from background processes. The Boost version used was 1.48. Furthermore the standard libraries including libc were rebuilt for the Linux system based tests, using architecture specific flags and optimizations. The following is a table mapping the scenarios to their respective architectures:

Scenario  Architecture
0         ThinkPad W510 (64-Bit Intel Quad Core Extreme i7-920XM 2.0GHz, 16GB RAM, Windows 7)
1-3       ThinkPad x61 (32-Bit Intel Core 2 Duo 2.4GHz, 2GB RAM, Windows 7)
4         ThinkPad x61 (32-Bit Intel Core 2 Duo 2.4GHz, 2GB RAM, Ubuntu 10.10)
5         Acer Aspire One (32-Bit Intel Atom N450 1.6GHz, 1GB RAM, Ubuntu 10.10)
6         HP Proliant DL380G6 (64-Bit Intel Xeon E5540 2.5GHz, 8GB RAM, Ubuntu 10.10)
7         Custom box (64-Bit Intel Xeon X5650 2.66GHz, 32GB RAM, Ubuntu 10.10)

Note 2: The percentages in the final column represent the percentage of the current row versus the baseline in total running time and rate respectively. For the first percentage the lower the value the better and for the second percentage the higher the value the better. The baseline used for a specific combination of tests is defined in the following table:

Test Combination                   Baseline
Boost, StrTk                       Boost
Boost, StdLib/STL, Spirit, StrTk   StdLib/STL
StdLib/STL, Spirit, StrTk          StdLib/STL
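The two percentages in the final column are plain ratios against the baseline row. A small sketch of the arithmetic follows. The function names time_percent and rate_percent are hypothetical and are not part of the benchmark program.

```cpp
// Time percentage of a row versus its baseline: lower is better.
inline double time_percent(double row_seconds, double baseline_seconds)
{
   return 100.0 * row_seconds / baseline_seconds;
}

// Rate percentage of a row versus its baseline: higher is better.
inline double rate_percent(double row_rate, double baseline_rate)
{
   return 100.0 * row_rate / baseline_rate;
}
```

Taking the Tokenizer rows of Scenario 0 as input, time_percent(3.5019, 8.5857) is roughly 40.79 and rate_percent(6853393.1186, 2795359.4074) is roughly 245.17. This agrees with the published 40.7% and 245.1% if those figures were truncated to one decimal place, which is an inference from the numbers and is not stated by the tables.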

Note 3: The test sizes are set such that no single run will result in a running time less than one second. This is done so as to ensure that runs per second results are not deemed to have been projected. In the future these sizes may need to be revisited once 3.5+GHz CPU speeds become more commonplace. Furthermore the charts represent the rate of operation over a one second interval. In short, the larger the rate the better.

Note 4: The binaries used for the above performance tests can be downloaded from here

Note 5: It would be great to have comparisons for other architectures. If you can provide access to shell accounts with GCC 4.5+ or Clang/LLVM 2.0+ for the following architectures: UltraSPARC T2 Plus, SPARC64 VII, POWER6/7, please feel free to get in contact.

StrTk Library Dependency

StrTk makes use of the Boost library for its boost::lexical_cast routine for types other than PODs, and its TR1 compliant Random and Regex libraries. These dependencies are not compulsory and can be easily removed simply by defining the preprocessor symbol: strtk_no_tr1_or_boost. That said, Boost is an integral part of modern C++ programming, and having it around is as beneficial as having access to the STL, hence it is recommended that it be installed. For Visual Studio users, BoostPro provides a free and easy to use installer for the latest Boost libraries that can be obtained from Here. For Linux users, mainstream distributions such as Ubuntu and Red Hat (Fedora) provide easy installation of the Boost libraries via their respective package management systems. For more information please consult the readme.txt found in the StrTk distribution.

Compiler Support

The following is a listing of the various compilers with which StrTk can be built free of errors and warnings.

   GCC versions 3.1+
   Clang/LLVM versions 1.0+
   Intel C++ Compiler versions 8+
   MSVC versions 7.1+
   Comeau C/C++ versions 4.3+
   PGI C++ versions 10.x+
   IBM XL C/C++ versions 10.x+

Note: Versions of compilers prior to the ones denoted above "should" compile, however they may require a very lenient warning/error level be set during compilation.

Conclusion
StrTk was designed with performance and efficiency as its sole primary principles, and as such some of the available interfaces may not be as user-friendly as they should be. That said, the gains made in other areas will hopefully compensate for any perceived difficulties. Like most things there is a trade off between performance and usability with the above mentioned tokenizers and parsing methods. The original aim was to provide an interface similar to that of the Boost Tokenizer and Split routines, but to also avail the developer with abstractions and various other simplifications that will hopefully provide them more flexibility and efficiency in the long run. That said, tokenizing a string isn't the most fascinating problem one could tackle, but it does have its interesting points: when one has a few TB of data to process, doing it properly could mean the difference between finishing a simple data processing job today or next month.

http://www.partow.net/programming/strtk/index.html
