Last edited by hardrock on 2013-11-22 14:34

The robots.txt file must be placed in the site's root directory. The simplest check is to append /robots.txt to your domain and open that URL in a browser; if the file loads, it is in the right place.

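The same check can be scripted. Below is a minimal Python sketch, not a definitive tool: robots_url and robots_is_reachable are names made up for this post, and www.xxx.com is just the placeholder domain used in this thread. It builds the root-level robots.txt address for a site and reports whether the file answers with HTTP 200.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site):
    """Return the root-level robots.txt URL for a domain or any page URL on it."""
    if "://" not in site:
        # Assumption: plain http when no scheme is given.
        site = "http://" + site
    parts = urlsplit(site)
    # Keep only scheme and host; robots.txt always lives directly under the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def robots_is_reachable(site, timeout=10):
    """Return True if the site's robots.txt can be fetched with status 200."""
    try:
        with urlopen(robots_url(site), timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # OSError covers HTTP errors, DNS failures, and timeouts;
        # ValueError covers malformed URLs.
        return False
```

Calling robots_is_reachable("www.xxx.com") with your real domain does over the network what the browser check does by hand. A file placed in a subdirectory, such as /blog/robots.txt, is not what this checks, since crawlers only look at the root.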
I found this code:

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/cache/
Disallow: /wp-content/languages/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /wp-content/upgrade/
Disallow: /wp-includes/
Disallow: /comments/
Disallow: /category/
Disallow: /tag/
Disallow: /page/
Disallow: /feed/
Disallow: /author/
Disallow: /trackback/
Disallow: /2010/
Disallow: /2011/
Disallow: /2012/
Disallow: /2013/
Disallow: /*/feed/
Disallow: /*/trackback/
Disallow: /*?
Disallow: /*/*?
Disallow: /*/*/*?
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /

# Google AdSense
User-agent: Mediapartners-Google*
Disallow:
Allow: /

# digg mirror
User-agent: duggmirror
Disallow: /

# Alexa archiver
User-agent: ia_archiver
Disallow: /

Sitemap: http://www.xxx.com/sitemap.xml
Sitemap: http://www.xxx.com/sitemap_baidu.xml
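
The problem is that this code is meant for Chinese-language sites targeting Baidu. I run an English-language site and need it to work for Google. How should the code above be changed for an English site?
I know nothing about code...

My main question is about lines 31 to 47 of the code (the crawler-specific sections, from "# Google Image" through the "ia_archiver" rule). Since this is an English site, shouldn't those lines allow crawling? Is blocking only meant for Chinese sites?
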
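One way to see what the crawler-specific sections do is to feed them to Python's standard urllib.robotparser. The sketch below is only an illustration under a few assumptions: it uses a shortened copy of the file (one rule from the general section plus three of the named groups), /hello-world/ is a made-up post path, and the wildcard rules are left out because, to my knowledge, this parser treats rule paths as plain prefixes. As far as I can tell, the groups are selected by crawler name, not by the language of the site.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules from the post (assumption: trimmed for brevity).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /

# digg mirror
User-agent: duggmirror
Disallow: /

# Alexa archiver
User-agent: ia_archiver
Disallow: /
"""


def load_rules(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


SITE = "http://www.xxx.com"  # placeholder domain from the post
rules = load_rules(ROBOTS_TXT)

# (crawler name, path) pairs; /hello-world/ is a hypothetical post URL.
checks = [
    ("Googlebot", "/hello-world/"),
    ("Googlebot", "/wp-admin/"),
    ("Googlebot-Image", "/wp-admin/"),
    ("duggmirror", "/hello-world/"),
    ("ia_archiver", "/hello-world/"),
]
for agent, path in checks:
    verdict = "allowed" if rules.can_fetch(agent, SITE + path) else "blocked"
    print(f"{agent:16} {path:16} {verdict}")
```

In this sketch a crawler with its own group follows only that group: Googlebot-Image is allowed everywhere, while duggmirror and ia_archiver are blocked everywhere. Any other crawler, Googlebot included, falls back to the "User-agent: *" group.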
Addendum (2013-12-22 17:43):
It doesn't need to be that complicated. The following is enough:
Sitemap: http://www.xxx.com/sitemap.xml
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-*

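To get a feel for what the shortened file blocks, here is a minimal Python sketch of Google-style pattern matching, where "*" matches any run of characters and a trailing "$" anchors the end of the URL. It is an approximation for illustration, not Google's actual matcher; rule_to_regex and rule_matches are names made up for this post, and the sample paths are only examples.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule into a compiled regular expression."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    # Without a trailing '$' the rule is a prefix match, so the end stays open.
    return re.compile(body + (r"\Z" if anchored else ""))


def rule_matches(rule, path):
    """Return True if the rule covers the given URL path."""
    return rule_to_regex(rule).match(path) is not None


for path in ["/wp-admin/", "/wp-content/uploads/photo.jpg", "/hello-world/"]:
    print(path, rule_matches("/wp-*", path), rule_matches("/wp-", path))
```

Two things stand out in this sketch. Because rules are prefix matches, "/wp-*" and "/wp-" cover the same paths, so the trailing asterisk is optional. The rule also covers /wp-content/uploads/, where WordPress stores uploaded media by default, which is worth keeping in mind if you want your images indexed.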
Addendum (2013-12-27 17:17):
http://blog.csdn.net/wallacer/article/details/654289