Last edited by hardrock on 2013-11-22 14:34
The robots.txt file has to sit in the site's root directory. The most basic check is to request your domain with /robots.txt appended directly after it; if that loads, the file is in the right place.
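For anyone who would rather script that check than type the address into a browser, here is a rough Python sketch using only the standard library. The function names `robots_url` and `robots_reachable` are made up for this example, and the check assumes a plain HTTP 200 response means the file is in place.

```python
from urllib.error import URLError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site):
    """Return the root-level robots.txt URL for a domain or page address."""
    if "://" not in site:
        # Assume plain http when only a bare domain is given.
        site = "http://" + site
    parts = urlsplit(site)
    # Keep scheme and host, drop any path or query, and point at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def robots_reachable(site, timeout=10):
    """Return True if the root-level robots.txt answers with HTTP 200."""
    try:
        with urlopen(robots_url(site), timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # DNS failures, refused connections, 404s and timeouts all land here.
        return False
```

Calling `robots_reachable("www.xxx.com")` makes a real network request, so try it against your own domain.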
I found this code:

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/cache/
Disallow: /wp-content/languages/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /wp-content/upgrade/
Disallow: /wp-includes/
Disallow: /comments/
Disallow: /category/
Disallow: /tag/
Disallow: /page/
Disallow: /feed/
Disallow: /author/
Disallow: /trackback/
Disallow: /2010/
Disallow: /2011/
Disallow: /2012/
Disallow: /2013/
Disallow: /*/feed/
Disallow: /*/trackback/
Disallow: /*?
Disallow: /*/*?
Disallow: /*/*/*?
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /

# Google AdSense
User-agent: Mediapartners-Google*
Disallow:
Allow: /

# digg mirror
User-agent: duggmirror
Disallow: /

# Alexa archiver
User-agent: ia_archiver
Disallow: /

Sitemap: http://www.xxx.com/sitemap.xml
Sitemap: http://www.xxx.com/sitemap_baidu.xml
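The problem is that this code is meant for a Chinese-language site targeting Baidu. I run an English-language site and need it to work for Google. How should the code above be changed to suit an English site?
I know absolutely nothing about code...

My main question is about lines 31 to 47 of the code (the "# Google Image" section through the ia_archiver block). Since mine is an English site, shouldn't these lines be allowing access? Is blocking crawlers only something Chinese sites do?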
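One way to see what those per-bot sections actually do is to feed a cut-down copy of the rules to Python's standard `urllib.robotparser` and ask it which crawler may fetch which path. This is only a sketch: `RULES` is a shortened excerpt, the helper names are invented for the example, and as far as I know this parser does plain prefix matching without Google's `*` and `$` wildcards, so only plain-prefix rules are used here.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the rules from the post (plain-prefix rules only).
RULES = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /tag/

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /

# Alexa archiver
User-agent: ia_archiver
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def allowed(parser, agent, path):
    """Return True if the named crawler may fetch the given path."""
    return parser.can_fetch(agent, path)


parser = build_parser(RULES)

for agent, path in [
    ("Googlebot", "/wp-admin/"),
    ("Googlebot", "/hello-world/"),
    ("Googlebot-Image", "/wp-admin/logo.png"),
    ("ia_archiver", "/hello-world/"),
]:
    print(agent, path, allowed(parser, agent, path))
```

The rules are keyed on the crawler's name, not on the language of the site. A crawler without its own group falls back to the `User-agent: *` group, an empty `Disallow:` blocks nothing, and `Disallow: /` blocks everything for that one bot.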
, s: `. V+ t/ t! U L: t% |2 Q$ H2 c6 v. _# N/ k4 n. l) S
" s$ a1 j4 L9 s$ ^4 c* g
Addendum (2013-12-22 17:43):
It doesn't need to be that complicated; the following is enough:
Sitemap: hxxp://www.xxx.com/sitemap.xml
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-*
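To get a feel for what a wildcard rule like `/wp-*` covers, here is a small Python sketch of wildcard matching as Google documents it: `*` stands for any run of characters and a trailing `$` anchors the end of the URL. It is an illustration only, not Google's actual matcher, and `pattern_to_regex` and `matches` are names invented for the example.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def matches(pattern, path):
    """Return True if the rule pattern applies to the URL path."""
    return pattern_to_regex(pattern).match(path) is not None


for path in ["/wp-admin/", "/wp-content/themes/x/style.css", "/hello-world/"]:
    print(path, matches("/wp-*", path))
```

Under this reading, `/wp-*` covers the same paths as the plain prefix `/wp-`, so it takes in wp-admin, wp-content and wp-includes in one line.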
Addendum (2013-12-27 17:17):
http://blog.csdn.net/wallacer/article/details/654289