Commit

update documentation
dpriedel committed May 4, 2021
1 parent 3ffb7f8 commit 7aa3ec9
Showing 2 changed files with 21 additions and 20 deletions.
28 changes: 13 additions & 15 deletions README.md
@@ -1,28 +1,24 @@
CollectEDGARData
Collector
================

Download manager for SEC EDGAR data files

July 17, 2017. Version 1.9 update:
May 4, 2021. Version 5.0 update:

It's been a while and things at the EDGAR site have changed.
It's been a while and things at the SEC EDGAR site have changed and so has this program.

The applications are now Poco application framework projects (pocoproject.org).
The applications are no longer Poco application framework projects. It's back to Boost.

There have been a number of changes due to compiler/boost/Poco updates. Everything is current right now -- gcc 7.1,
boost 1-64, Poco 1.7.8p3.
There have been a number of changes due to compiler/boost/other library updates. See the 'building' file for changes.

Most importantly, the EDGAR site has changed from FTP access to HTTPS access. I have used the Poco library to
implement support for HTTPS access with SSL verification. (The SSL verification is turned off for now because
my local test setup does not have a 'valid' SSL cert.)

Also, I have refactored a lot of the code related to file downloads in preparation for the main feature to be
added next -- concurrent downloads. According to the EDGAR web site, one is allowed up to 10 requests per second.
I will be adding code to support this feature. This will make the code version 2.0.
I'm using the Boost Beast library to handle file downloads and HTTPS/SSL interactions. Still not using
SSL Certificate validation though.

The application now supports optional concurrent downloads. The SEC site has a limit of 10 connections per second.
The application allows you to use a higher number but you will likely be stopped by the site.
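
To illustrate the per-second limit described above, here is a minimal sketch of one way to pace requests. It is not taken from the Collector source: `RequestPacer` and `Reserve` are hypothetical names, and spacing requests evenly is only one possible reading of the limit.

```cpp
#include <algorithm>
#include <chrono>
#include <stdexcept>

// Hypothetical helper (not part of Collector): spaces request start times
// evenly so that no more than max_per_second requests begin in any second.
class RequestPacer
{
public:
    using Clock = std::chrono::steady_clock;

    explicit RequestPacer(int max_per_second)
    {
        if (max_per_second <= 0)
        {
            throw std::invalid_argument("max_per_second must be positive");
        }
        // Minimum gap between two consecutive request start times.
        interval_ = std::chrono::duration_cast<Clock::duration>(std::chrono::seconds(1)) / max_per_second;
    }

    // Reserves the next free slot and returns the time at which the
    // caller may start its request. 'now' is passed in so the logic
    // can be tested without sleeping.
    Clock::time_point Reserve(Clock::time_point now)
    {
        const Clock::time_point slot = std::max(now, next_free_);
        next_free_ = slot + interval_;
        return slot;
    }

private:
    Clock::duration interval_{};
    Clock::time_point next_free_ = Clock::time_point::min();
};
```

A caller could wait with `std::this_thread::sleep_until(pacer.Reserve(RequestPacer::Clock::now()))` before each request. With several download threads, calls to `Reserve` would also need a mutex around them, which this sketch leaves out.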

This project is part of a set of projects to make use of the SEC's EDGAR data filings available on Linux computers.
It is also a chance to explore using C++11 and to try out Test Driven Development with C++.
It is also a chance to explore using C++17 through 23 and to try out Test Driven Development with C++.

This program serves to download selected EDGAR filing files from the SEC's FTP site. The files to be downloaded
can be filtered by date range, form type (10-Q, 4, etc.) and ticker symbol. Files are downloaded to a
@@ -35,7 +31,9 @@ is the same. The only difference is how many records are in the index files.

EDGAR form files are identified by CIK number which is assigned by the EDGAR system. This program will also do a
ticker to CIK conversion. You can provide a file with a list of stock ticker symbols and the corresponding CIK will
be found. The results are saved in a file on your system so you can reuse them in future runs.
be found. The results are saved in a file on your system so you can reuse them in future runs. This feature used to
query the SEC site repeatedly for the conversions, but it now downloads the entire ticker-to-CIK file, which became
available some time ago.
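
The ticker-to-CIK lookup described above could be sketched roughly as follows. This is an assumption-laden illustration, not the program's actual code: it assumes the downloaded file holds one `ticker<TAB>CIK` pair per line, and `LoadTickerToCIK` is a hypothetical name.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical helper (not part of Collector): reads lines of the assumed
// form "<ticker><TAB><cik>" and returns a map keyed by upper-case ticker.
std::unordered_map<std::string, std::string> LoadTickerToCIK(std::istream& input)
{
    std::unordered_map<std::string, std::string> result;
    std::string line;
    while (std::getline(input, line))
    {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r')
        {
            line.pop_back();
        }
        const auto tab = line.find('\t');
        // Skip lines without a tab, or with an empty ticker or CIK.
        if (tab == std::string::npos || tab == 0 || tab + 1 >= line.size())
        {
            continue;
        }
        std::string ticker = line.substr(0, tab);
        std::transform(ticker.begin(), ticker.end(), ticker.begin(),
                       [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
        // The CIK stays a string so any leading zeros are preserved.
        result.emplace(std::move(ticker), line.substr(tab + 1));
    }
    return result;
}
```

Loading the whole file once and looking tickers up in memory replaces one network query per symbol with a single download, which is the trade-off the paragraph above describes.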



13 changes: 8 additions & 5 deletions building
@@ -1,10 +1,13 @@

To build this application you need the following:
- compiler which supports c++1z. I used gcc 7.1. Clang 4.0 should work too.
- Boost libraries (www.boost.org). I used Boost 1.64
- Poco libraries (pocoproject.org). I used 1.7.8.p3
- cpp-json https://github.com/eteran/cpp-json. I used version 3.3.1

- my app_framework is no longer needed as Poco provides these facilities.
- compiler which supports C++20 (and a little bit of C++23). I used gcc 11.1.
- Boost libraries (www.boost.org). I used Boost 1.76 for regex, iostreams, program_options, json
- Range-v3 (https://github.com/ericniebler/range-v3)
- date (https://github.com/HowardHinnant/date)
- spdlog (https://github.com/gabime/spdlog)
- fmtlib (https://github.com/fmtlib/fmt)


Since I have multiple versions of compilers and boost on my system and they are located in
non-standard places, there are some makefile variables which need to be set to point to
