High Performance Index Build Algorithms for Intranet Search Engines

Marcus Fontoura, Eugene Shekita, Jason Zien, Sridhar Rajagopalan, Andreas Neumann
fontoura@almaden.ibm.com
2004, M. Fontoura. VLDB, Toronto, September 2004

Agenda
- Overview and problem description
- Global analysis
- Major data structures for index build
- Index build algorithm

Overview and problem description
- The goal of Trevi is to provide high-quality intranet search for corporate portals such as w3.ibm.com
- Trevi is a scalable text search engine being developed by a joint IBM Research and Software Group team
- This talk focuses on how to incorporate global analysis efficiently into the index build process

Global analysis (GA)
- Duplicate detection
  - Computes a fingerprint for each page (64-bit shingle)
  - Masters are identified by using the (previous) static rank
- Anchor text (D1: <a href="D2">Trevi</a>)
  - Appends anchor text tokens to documents
- Static rank
  - Host in-degree, i.e., the number of hosts that point to a page (~ PageRank on the IBM intranet)

Index build requires GA
- Rebuild the inverted text index and update the global analysis (GA)
  - Duplicate documents are deleted from the index
  - Anchor text is indexed together with the document's content
  - Static rank gives the index ordering, allowing for early termination during query evaluation
- The time to rebuild the index will be dominated by the GA time as the analyses get more complex
  - Semantic search

Major data structures
- Store
  - Storage for the tokenized version of each document
- Index
  - Inverted text index over the Store
- Delta store and delta index
  - Small versions of the Store and Index holding new and modified documents
  - Allow for hourly updates of the Index content

Index build algorithm (1/3)
- Index build merges the current version of the Store (Store_i) with the current version of the DeltaStore and generates the new version of the Store and the new Index, Store_{i+1} and Index_{i+1}

Index build algorithm (2/3)
- Index build using global analysis

Index build algorithm (3/3)
- Index build using lagging global analysis

Indexing algorithm
- Radix sort
  - Linear-time sorting
  - Flexibility in defining the sort criteria
- Bigger sort buffers increase performance
- Pipelining of the load and sort phases

Experimental results
- Lagging global analysis does not degrade quality
- More than 25% performance improvement
  - Even more advantageous when the analyses are more complex
- Indexing algorithm scales linearly with the number of documents
- Superior performance when compared to several state-of-the-art indexing algorithms

Hardware and software architectures

[Figure, Index build algorithm (1/3): diagram with labels Store_i, DeltaStore, Index Build, Store_{i+1}, Index_{i+1}.]
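The merge described on the "Index build algorithm (1/3)" slide can be sketched as follows. This is a minimal illustration and not Trevi's implementation: the in-memory dictionaries, the function name `build_index`, and the `rank` argument are assumptions made for the example. The only points taken from the slides are that the DeltaStore holds new and modified documents, that the merge yields Store_{i+1} and Index_{i+1}, and that static rank gives the index ordering.

```python
from collections import defaultdict


def build_index(store, delta_store, rank=None):
    """Merge store with delta_store, then invert the merged store.

    store, delta_store: dict mapping doc id -> list of tokens (assumed layout).
    rank: optional dict mapping doc id -> static rank (hypothetical input).
    """
    # New and modified documents in the delta replace older versions.
    new_store = {**store, **delta_store}
    rank = rank or {}
    # Visit documents by descending static rank (ties broken by doc id),
    # so every posting list ends up in rank order.
    order = sorted(new_store, key=lambda d: (-rank.get(d, 0), d))
    index = defaultdict(list)
    for doc_id in order:
        for term in sorted(set(new_store[doc_id])):
            index[term].append(doc_id)
    return new_store, dict(index)


store_i = {"d1": ["trevi", "search"], "d2": ["intranet", "search"]}
delta = {"d2": ["intranet", "portal"], "d3": ["search", "engine"]}
static_rank = {"d1": 1, "d2": 5, "d3": 3}

store_next, index_next = build_index(store_i, delta, static_rank)
print(index_next["search"])  # ['d3', 'd1']
```

Because each posting list is in rank order, a query evaluator could stop after the first few matches, which is the early termination the slides mention.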
[Figure, Index build algorithm (2/3): diagram with stages Global Analysis, Index Build, and DeltaIndex Build; data labels Store_i, DeltaStore, newly crawled documents, DeltaStore_j, Dup_{i+1}, AnchorText_{i+1}, Rank_{i+1}, Store_{i+1}, Index_{i+1}, DeltaStore_{j+1}, DeltaIndex_{j+1}.]

[Figure, Index build algorithm (3/3): diagram with stages Global Analysis, Index Build, and DeltaIndex Build; data labels Store_i, DeltaStore, GA_i, GA inputs, GA_{i+1}, newly crawled documents, DeltaStore_j, Store_{i+1}, Index_{i+1}, DeltaStore_{j+1}, DeltaIndex_{j+1}. Annotation: "Global Analysis and DeltaIndex build can proceed in parallel".]

[Figure, Indexing algorithm: only a fragment of a label beginning "sortKey" survives.]
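The "Indexing algorithm" slide names radix sort for its linear running time and for the flexibility of the sort criteria, but gives no further detail. The sketch below is one textbook way to obtain both properties: a stable least-significant-digit radix sort driven by a caller-supplied key function. The function name, the 32-bit key width, and the packing of a term id and a document position into one key are assumptions made for the illustration.

```python
def radix_sort(records, sort_key, key_bits=32, radix_bits=8):
    """Stable LSD radix sort of records by a non-negative integer key.

    sort_key(record) must return an integer below 2**key_bits.
    Runs key_bits / radix_bits passes, each linear in len(records).
    """
    mask = (1 << radix_bits) - 1
    for shift in range(0, key_bits, radix_bits):
        buckets = [[] for _ in range(mask + 1)]
        for rec in records:
            # Distribute by the current digit; appending keeps the pass stable.
            buckets[(sort_key(rec) >> shift) & mask].append(rec)
        records = [rec for bucket in buckets for rec in bucket]
    return records


# Hypothetical postings: (term id, document position in rank order).
postings = [(7, 2), (3, 9), (7, 1), (3, 4)]

# The sort criterion is just a key function: term id in the high 16 bits,
# document position in the low 16 bits.
sorted_postings = radix_sort(postings, lambda p: (p[0] << 16) | p[1])
print(sorted_postings)  # [(3, 4), (3, 9), (7, 1), (7, 2)]
```

Changing the sort criterion only means changing the key function, for example swapping the two fields to group by document first.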
[Figure, Hardware and software architectures: components labeled Query Server, Crawler, Index Build, IP Sprayer, and Local Gigabit Switch; storage labeled Crawled Documents and, in two places, Store, Index, DeltaStore, DeltaIndex; two "data copy" labels; a link to the global IBM Intranet.]
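The static rank on the "Global analysis (GA)" slide is defined as host in-degree, the number of hosts that point to a page. A minimal sketch of that count is given below. The input format (a list of source and target URL pairs), the function name, and the decision to ignore links from a page's own host are assumptions for the example and are not stated in the slides.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def host_in_degree(links):
    """Count, for each target page, the distinct hosts that link to it."""
    linking_hosts = defaultdict(set)
    for source_url, target_url in links:
        source_host = urlsplit(source_url).hostname
        target_host = urlsplit(target_url).hostname
        # Assumption: links from the page's own host do not count.
        if source_host != target_host:
            linking_hosts[target_url].add(source_host)
    return {page: len(hosts) for page, hosts in linking_hosts.items()}


home = "http://w3.example.com/home"
links = [
    ("http://a.example.com/x", home),
    ("http://a.example.com/y", home),      # same host as above, counted once
    ("http://b.example.com/z", home),
    ("http://w3.example.com/about", home),  # same-host link, ignored
]
print(host_in_degree(links))  # {'http://w3.example.com/home': 2}
```

Counting hosts rather than pages means that many links from a single site add only one to the rank.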