This post was last edited by hardrock on 2013-11-22 14:34
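The robots.txt file has to sit in the root directory of the website. The most basic check is to put robots.txt directly after your domain name and open that address; if it loads, the file is in the right place.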
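The check described above can also be scripted. This is only a minimal sketch using the Python standard library: the function names `robots_url` and `robots_is_reachable` are made up for illustration, and `www.xxx.com` is the placeholder domain from this post, not a real site.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site: str) -> str:
    """Build the root-level robots.txt address for a domain or any page URL."""
    if "://" not in site:
        # Assumption: plain http when only a bare domain is given.
        site = "http://" + site
    parts = urlsplit(site)
    # Drop any path, query, or fragment: robots.txt must live at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def robots_is_reachable(site: str, timeout: float = 10.0) -> bool:
    """Return True if the root-level robots.txt answers with HTTP 200."""
    try:
        with urlopen(robots_url(site), timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError):
        # Covers HTTP errors, DNS failures, timeouts, and malformed URLs.
        return False
```

For example, `robots_url("https://www.xxx.com/blog/post?x=1")` gives `https://www.xxx.com/robots.txt`, which is the address to open in a browser for the manual check.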
I found this piece of code:

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/cache/
Disallow: /wp-content/languages/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /wp-content/upgrade/
Disallow: /wp-includes/
Disallow: /comments/
Disallow: /category/
Disallow: /tag/
Disallow: /page/
Disallow: /feed/
Disallow: /author/
Disallow: /trackback/
Disallow: /2010/
Disallow: /2011/
Disallow: /2012/
Disallow: /2013/
Disallow: /*/feed/
Disallow: /*/trackback/
Disallow: /*?
Disallow: /*/*?
Disallow: /*/*/*?
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /

# Google AdSense
User-agent: Mediapartners-Google*
Disallow:
Allow: /

# digg mirror
User-agent: duggmirror
Disallow: /

# Alexa archiver
User-agent: ia_archiver
Disallow: /

Sitemap: http://www.xxx.com/sitemap.xml
Sitemap: http://www.xxx.com/sitemap_baidu.xml
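The problem is that this code is meant for a Chinese-language site targeting Baidu, while I run an English-language site and need it to work for Google. How should the code above be changed to suit an English site?
I know nothing at all about code...

My main question is about lines 31 to 47 of the code (the Google Image, Google AdSense, digg mirror, and Alexa archiver blocks). Since this is an English site, those lines should be set to allow, right? Is it only Chinese sites that block crawling?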
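One way to see what those per-crawler blocks actually do is to feed them to a robots.txt parser and ask it directly. The following is a rough sketch with Python's standard `urllib.robotparser`; the helper name `allowed` and the shortened `RULES` sample are my own assumptions for illustration. As far as I know this parser only does plain prefix matching, so the wildcard rules (`*` and `$`) and the `Mediapartners-Google*` group are left out of the sample.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the listing: the catch-all group plus three of the
# per-crawler groups. Wildcard path rules are omitted on purpose.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /

# digg mirror
User-agent: duggmirror
Disallow: /

# Alexa archiver
User-agent: ia_archiver
Disallow: /
"""


def allowed(agent: str, path: str, rules: str = RULES) -> bool:
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent in ("Googlebot", "Googlebot-Image", "duggmirror", "ia_archiver"):
        print(agent, allowed(agent, "/wp-admin/"))
```

Read this way, an empty `Disallow:` together with `Allow: /` lets that crawler fetch everything, while `Disallow: /` shuts a crawler out completely. So the Google Image block is already an allow rule, and only duggmirror and ia_archiver are blocked; none of this depends on the language of the site.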
Addendum (2013-12-22 17:43):
It is not that complicated; the following is enough:
Sitemap: hxxp://www.xxx.com/sitemap.xml
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-*
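For anyone unsure what a pattern such as `/wp-*` or `/*.php$` ends up matching, here is a small sketch of the wildcard semantics Google documents for robots.txt, where `*` stands for any run of characters and a trailing `$` pins the match to the end of the URL path. The function name `rule_matches` is hypothetical, and this only tests a single pattern; it does not resolve conflicts between Allow and Disallow rules.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check whether one robots.txt path pattern matches a URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"  # '$' in the rule means the path must end here
    # re.match anchors at the start, so every rule is a prefix match.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    print(rule_matches("/wp-*", "/wp-admin/"))            # True
    print(rule_matches("/wp-*", "/blog/wp-notes"))        # False
    print(rule_matches("/*.php$", "/index.php"))          # True
    print(rule_matches("/*.php$", "/index.php?x=1"))      # False
```

Because rules are prefix matches anyway, the trailing `*` in `/wp-*` adds nothing: `Disallow: /wp-` would block the same URLs.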
Addendum (2013-12-27 17:17):
http://blog.csdn.net/wallacer/article/details/654289