Skip to main content

scripts that allows me to connect to a page or has a method to download file from the web

Hello everyone,
Someone has a scripts that allows me to connect to a page and download an excel document and the same script allows me to save it in a directory, and then export it to IDEA, or has a method to download file from the web

Brian Element Wed, 04/20/2022 - 11:35

Something like this would probably be easier to do in Python then IDEAScript as Python has modules that allow you to interact with a web page.

klmi Wed, 04/27/2022 - 08:40

It's my experience that you have to solve a lot of problems if you would like to do that with IDEA's Python environment.

1) Web scraping:


import requests
r = requests.get("http://www.yourdomain.xxx/yourfile.xlsx")

However the module "requests" isn't included in IDEA's standard installation.

I don't know whether module "urllib" would be an alternative (or there may be dependencies) and can't test that at the moment:


import urllib
urllib.request.urlretrieve("http://www.yourdomain.xxx/yourfile.xlsx", "outputfile.xlsx")

2) Excel import to Dataframe:

A simple import of Excel files with "pandas.read_excel" is not possible because the module "xlrd" is missing in IDEA's Python environment. When you try to download "xlrd" from Github you will find only the newest version, which brings us to the next problem that newer versions of "xlrd" don't support *.xlsx files (only old Excel format *.xls). Actually it's possible to set an engine option with "pandas.read_excel". However the older pandas version which comes with IDEA does only support engine=xlrd or engine=None. But the module "openpyxl" which is included in IDEA's Python environment isn't supported by that pandas version. To make it short - the only solution I found to import Excel files directly into Python resp. a pandas.DataFrame is "openpyxl".

Due to performance aspects I don't recommend a simple implementation like:


from openpyxl import load_workbook
import pandas as pd
wb=load_workbook (filename)
ws=wb[sheetname]
df=pd.DataFrame(ws.values)

The use of "openpyxl" built in iterators makes it a lot faster!

scotchy33 Thu, 04/28/2022 - 12:43

Someting like this should work.
Sub Main
 
Dim WinHttpReq As Object
Dim oStream As Object
Dim sURL As String
Dim savePathFile As String
 
 
Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
 
sURL = "https://www.ideascripting.com/sites/default/files/ukx-with-selection.zip"
 
savePathFile = "C:\ukx-with-selection.zip"
 
WinHttpReq.Open "GET", sURL, False
 
WinHttpReq.send
 
If WinHttpReq.Status = 200 Then
 
Set oStream = CreateObject("ADODB.Stream")
oStream.Open
oStream.Type = 1
oStream.Write WinHttpReq.responseBody
oStream.SaveToFile (savePathFile)
oStream.Close
 
End If
 
Set oStream = Nothing
Set WinHttpReq = Nothing
 
End Sub