Question : Fetch special characters like "Ñ" and absolute URL from href attribute of anchors
Hi there
I used the following code to fetch the source of a web page (the address assigned to url2 in the code below), but ran into two painful problems.
1. Odd or missing characters: for example, the name "ALBARIÑO ORGANISTRUN" becomes "ALBARIÿ ORGANISTRUN" when displayed in Notepad++, or "ALBARI ORGANISTRUN" when displayed in Notepad. What I want is "ALBARIÑO ORGANISTRUN".
2. Relative URLs: for example, the href value of the anchor "Blancos" is "prodtype.asp?PT_ID=107&numRecordPosition=1&strPageHistory=cat&strKeywords=&strSearchCriteria=", but what I really want is its absolute address, "http://www.elcatavinos.com/tienda/prodtype.asp?PT_ID=107&numRecordPosition=1&strPageHistory=cat&strKeywords=&strSearchCriteria=".
Can anyone help me sort them out?
Thanks in advance!
Jason
____________________________________________________________
Code I used:
dim url1
dim url2
dim xmlhttp
dim datafile
dim FS
dim dataFileTs
dim i
dim cookie
url1 = "http://www.elcatavinos.com/tienda/store/dynamicIndex.asp?sm=b1"
url2 = "http://www.elcatavinos.com/tienda/product.asp?numRecordPosition=5&P_ID=25473&strPageHistory=cat&strKeywords=&SearchFor=&PT_ID=107"
datafile = "c:\temp\test.dat"
set FS = Wscript.CreateObject("Scripting.FileSystemObject")
set datafileTs = FS.CreateTextFile(datafile, True, True)
set xmlHTTP = Wscript.CreateObject("MSXML2.XMLHTTP.3.0")
xmlHTTP.Open "HEAD",url1, false
xmlHTTP.Send
i = 0
do until xmlHTTP.readyState = 4
    Wscript.Sleep 100
    i = i + 1
    if i > 1000 then exit do
Loop
cookie = xmlhttp.getResponseHeader("set-cookie")
xmlHTTP.Open "GET",url2, false
xmlHTTP.SetRequestHeader "set-cookie",cookie
xmlHTTP.SetRequestHeader "Content-Type","text/html; charset=iso-8859-1"
xmlHTTP.SetRequestHeader "Content-Location","absoluteURI"
xmlHTTP.Send
i = 0
do until xmlHTTP.readyState = 4
    Wscript.Sleep 100
    i = i + 1
    if i > 1000 then exit do
Loop
datafileTs.Writeline xmlhttp.responseText
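A note on problem 1, offered as a guess rather than a confirmed diagnosis: responseText decodes the response as UTF-8 unless it finds a Unicode byte-order mark, so if this server sends ISO-8859-1 bytes, "Ñ" would be mangled before it ever reaches the file. Setting Content-Type on the request describes the request body, not the response, so it would not change this. A minimal sketch of a workaround is to read the raw bytes from responseBody and decode them with ADODB.Stream. The helper name BytesToText and the variable pageUrl are made up for illustration, the page encoding is assumed to be ISO-8859-1, and the cookie step from the script above is left out for brevity.

```vbscript
Option Explicit

' Hypothetical helper: decode a byte array with the named charset
' by round-tripping it through an ADODB.Stream.
Function BytesToText(body, charsetName)
    Dim stream
    Set stream = CreateObject("ADODB.Stream")
    stream.Type = 1                 ' adTypeBinary: accept raw bytes
    stream.Open
    stream.Write body
    stream.Position = 0             ' Type and Charset can only change at position 0
    stream.Type = 2                 ' adTypeText: read the same bytes back as text
    stream.Charset = charsetName
    BytesToText = stream.ReadText
    stream.Close
End Function

Dim pageUrl, http, fso, outFile
pageUrl = "http://www.elcatavinos.com/tienda/product.asp?numRecordPosition=5&P_ID=25473&strPageHistory=cat&strKeywords=&SearchFor=&PT_ID=107"

Set http = CreateObject("MSXML2.XMLHTTP.3.0")
http.Open "GET", pageUrl, False     ' synchronous: Send returns when the response is complete
http.Send

Set fso = CreateObject("Scripting.FileSystemObject")
Set outFile = fso.CreateTextFile("c:\temp\test.dat", True, True)   ' overwrite, write as Unicode
' Assumption: the page is encoded as ISO-8859-1.
outFile.WriteLine BytesToText(http.responseBody, "iso-8859-1")
outFile.Close
```

If the page turns out to use a different encoding (Windows-1252, for instance), only the charset name passed to BytesToText would need to change.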
Answer : Fetch special characters like "Ñ" and absolute URL from href attribute of anchors
> .. href attribute of every anchor should already contain its domain ..
I already explained (http:#22674364) that the client cannot enforce this; the server needs to do it.
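What the client can do is resolve each relative href itself against the address of the page it was fetched from. The following is only a simplified sketch: the function name ResolveHref is made up for illustration, the base address is assumed to be an absolute http URL, and cases such as "../" segments, "//host/..." references, query-only or fragment-only hrefs, and a <base> tag in the page are not handled.

```vbscript
Option Explicit

' Hypothetical helper: turn a relative href into an absolute URL,
' using the URL of the page that contained the anchor as the base.
Function ResolveHref(baseUrl, href)
    Dim re, pathOnly, cutAt, schemeEnd, hostEnd, lastSlash

    ' An href that starts with a scheme (http:, mailto:, ...) is already absolute.
    Set re = New RegExp
    re.Pattern = "^[a-z][a-z0-9+.\-]*:"
    re.IgnoreCase = True
    If re.Test(href) Then
        ResolveHref = href
        Exit Function
    End If

    ' Drop the query string and fragment from the base URL.
    pathOnly = baseUrl
    cutAt = InStr(pathOnly, "?")
    If cutAt > 0 Then pathOnly = Left(pathOnly, cutAt - 1)
    cutAt = InStr(pathOnly, "#")
    If cutAt > 0 Then pathOnly = Left(pathOnly, cutAt - 1)

    ' Find the first "/" after "scheme://host".
    schemeEnd = InStr(pathOnly, "://") + 3
    hostEnd = InStr(schemeEnd, pathOnly, "/")
    If hostEnd = 0 Then
        pathOnly = pathOnly & "/"
        hostEnd = Len(pathOnly)
    End If

    If Left(href, 1) = "/" Then
        ' Root-relative: keep only scheme and host.
        ResolveHref = Left(pathOnly, hostEnd - 1) & href
    Else
        ' Document-relative: keep the base up to its last "/".
        lastSlash = InStrRev(pathOnly, "/")
        ResolveHref = Left(pathOnly, lastSlash) & href
    End If
End Function

Dim pageUrl, relHref
pageUrl = "http://www.elcatavinos.com/tienda/product.asp?numRecordPosition=5&P_ID=25473&strPageHistory=cat&strKeywords=&SearchFor=&PT_ID=107"
relHref = "prodtype.asp?PT_ID=107&numRecordPosition=1&strPageHistory=cat&strKeywords=&strSearchCriteria="

WScript.Echo ResolveHref(pageUrl, relHref)
```

For the "Blancos" anchor from the question, this produces the address under http://www.elcatavinos.com/tienda/ that the asker was looking for.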