Kamis, 26 Mei 2016

[smf_addin] Digest Number 3710

4 Messages

Digest #3710
1b
Re: Quick Examination Worksheet by "Randy Harmelink" rharmelink
2a
2b
Re: ETF Specific Data Items by "Randy Harmelink" rharmelink

Messages

Wed May 25, 2016 8:30 am (PDT) . Posted by:

jterrenceho

Randy,


I believe you had put the data in the workbook as an example.


Would you be able to take any particular website and show how the workbook works?


I was able to understand how all the other templates works, but this one was a bit confusing.


BTW, the plug-in is excellent!! And I was able to build workbooks to perform challenging computations.


Wed May 25, 2016 6:19 pm (PDT) . Posted by:

"Randy Harmelink" rharmelink

​Well, it's not really a template. More of a tool to see how some of​ the
add-in functions might operate on a given web page and to examine the web
page itself.

Let me walk through my last usage of the workbook. I was using the "Table
Extract" worksheet.

A question was asked about extracting executive compensation from this
MorningStar web page:

http://insiders.morningstar.com/trading/executive-compensation.action?t=MMM

The first thing I did was look at the source code of the web page and see
that the table in question isn't actually there, because MorningStar is
dynamically creating the web page. By using a URL sniffer -- FireFox is my
browser, so I used the HTTPFox extension -- and found out the data actually
came from this web page:


http://insiders.morningstar.com/trading/executive-compensation-list.action?t=MMM

A very messy looking page, because it's used to build the other page, so
presentation doesn't really matter. When I looked at the source code of
that web page, I see a lot of tables. So I'm curious what they will look
like extracted.

So I go to the "Table Extract" worksheet, put that last URL into cell C6,
delete the value in cell C6 and set ​C8 and C10 to 1. This should extract
the first table on the web page.

Hmmm. Just a header line, so I change cell C10 to be a 2, which should
display the first 2 tables on the web page. Ah, more data. Let's try 3
pages. So that looks like the summary of all executives. I extended the
array-entered function to several hundred rows so I could see more
extracted data, then increase the value of C10 one value at a time to see
what comes up. The 4th table is just another header, and the 5th table is
the additional data for that header. And the pattern repeats. Each
executive is 2 tables.

So, now I know I can use these formulas to grab each set of data:

Summary entry is tables 1 thru 3, so: =RCHGetHTMLTable(URL,"",1,"",3)
First executive is tables 4 and 5, so: =RCHGetHTMLTable(URL,"",4,"",5)
Next executive is tables 6 and 7, so: =RCHGetHTMLTable(URL,"",6,"",7)

...and so forth. Now, the last question is how to get all tables at once.
But if I try an arbitrarily high number in cell C10, say 20, it doesn't get
all the data. So now I know I need to find a way to either identify how
many tables there are or to find a way to identify the end of the last
table. The latter is more in line with the searching method of the add-in
functions, so I examine the source code of the web page for a search term
after the last table. I see there is some <script> code at the end of the
page, and check to see if one is earlier on the web page. Nope, so I'll try
that. So I put "<script>;" in cell C9. And I know I need to go backward from
there to find the end of the HTML table immediately before it, so I put -1
in cell C10.

Voila! All the data is there. So now I know my formula to grab ALL the
tables at once is:

=RCHGetHTMLTable(URL,"",1,"<script>",-1)

So, like I said, it's sometimes a task that requires both looking at the
source code and doing some trial and error. The advantage of using this
worksheet is that it's VERY easy to play around with the parameters to see
what works and what gets returned. If I run into issues extracting there, I
may have to go back to the source and see why. And it may mean a different
form of extraction needs to be done.

Another thing I could have done early on, when I saw all the tables on the
web page with the data, is go to the "By HTML Tag" worksheet, put the URL
into cell C7 and left "table" in cell C8. That shows me that the first
table on the page has different structure than the rest, and that then
there seem to be pairs of tables with the same structure. So that could
have been a way for me to know on the "Table Extract" worksheet to try 1
thru 3, 4 thru 5, 6 thru 7, and so forth. But I still might have looked at
some of the tables individually to see what they contained. An easy way to
look at tables one by one is to put "=C6" in cell C8, so that I only have
to change values in cell C6. Then C8 matches that value and I get just that
one table.

Now, all of that might sound like a lot of work, but it saves me from
having to do a lot of other work, and constantly re-entering the array
forrmula. And I've used most of these worksheets so many times that the
whole process above took me about 20 minutes to do. It took me less time
than writing this up! :)

On Wed, May 25, 2016 at 7:16 AM, jterrenceho@
​...wrote:

>
> I believe you had put the data in the workbook as an example.
>
> Would you be able to take any particular website and show how the workbook
> works?
>
> I was able to understand how all the other templates works, but this one
> was a bit confusing.
>
> BTW, the plug-in is excellent!! And I was able to build workbooks to
> perform challenging computations.
>
>

Wed May 25, 2016 9:30 am (PDT) . Posted by:

norton1717

Has anyone done any work on ETF Specific data items. Some items of interest are AUM, Exp Ratio, Avg Daily Vol and Spread.

Wed May 25, 2016 12:33 pm (PDT) . Posted by:

"Randy Harmelink" rharmelink

Examples of data from Yahoo based on the items you asked about:

=1000*RCHGetTableCell("http://finance.yahoo.com/q?s=SPY",1,">Net Assets")

=RCHGetTableCell("http://finance.yahoo.com/q/pr?s=SPY",1,"Expense Ratio")

=RCHGetTableCell("http://finance.yahoo.com/q?s=SPY",1,">Avg Vol")

=RCHGetTableCell("http://finance.yahoo.com/q?s=SPY",1,">NAV")

Wasn't sure what you meant by spread, so I picked up the NAV?

On Wed, May 25, 2016 at 6:57 AM, rnorton@me.com [smf_addin] <
smf_addin@yahoogroups.com> wrote:

> Has anyone done any work on ETF Specific data items. Some items of
> interest are AUM, Exp Ratio, Avg Daily Vol and Spread.
>
For the Add-in, Documentation, Templates, Tips and FAQs, visit http://ogres-crypt.com/SMF

Tidak ada komentar:

Posting Komentar