EDGAR – Assessing Available Data

Using the EDGAR FTP, I downloaded all SEC filings for CQ3 2014, which comes to about 206,000 lines of data. Scaling the dataset to include previous years should not be a problem. For now, I focused on CQ3 2014 at a high level.

EDGAR identifies each company by a CIK code. From Rank and Filed, I downloaded an index that maps CIK codes to US exchange tickers. I merged this index into my dataset, so I can now identify companies by a more universal identifier (tickers).
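The CIK-to-ticker merge described above can be sketched roughly as follows. This is a minimal illustration, not the workflow actually used: the field names (`cik`, `form`, `ticker`), the function names, and the sample rows are all hypothetical, and it assumes CIKs may appear with or without leading zeros.

```python
# Hypothetical sketch: attach tickers to filing rows via a CIK lookup.
# Field names and sample data are assumptions, not taken from the real files.

def normalize_cik(cik):
    """Strip whitespace and leading zeros so '0000000123' and '123' match."""
    return str(cik).strip().lstrip("0") or "0"


def attach_tickers(filings, cik_to_ticker):
    """Return copies of the filing rows with a 'ticker' key added.

    Rows whose CIK has no known ticker get ticker=None so they can be
    inspected or dropped later.
    """
    lookup = {normalize_cik(k): v for k, v in cik_to_ticker.items()}
    merged = []
    for row in filings:
        ticker = lookup.get(normalize_cik(row["cik"]))
        merged.append({**row, "ticker": ticker})
    return merged


# Made-up example rows (not real companies).
example_filings = [
    {"cik": "0000000123", "form": "4"},
    {"cik": "456", "form": "3"},
    {"cik": "0000000789", "form": "SC 13D"},
]
example_map = {"123": "AAA", "0000000456": "BBB"}
example_merged = attach_tickers(example_filings, example_map)
```

Keeping unmatched rows with a `None` ticker, rather than dropping them, makes it easy to check how much of the dataset the ticker index actually covers.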

I logged the frequency of each form filing, and chose to isolate Form 3, Form 4, and 13-D.
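As a rough sketch of how the form frequencies could be logged, the snippet below counts form types from pipe-delimited index rows. It assumes rows laid out like EDGAR's quarterly master index (`CIK|Company Name|Form Type|Date Filed|Filename`); the function names and the sample rows are hypothetical.

```python
from collections import Counter


def parse_index_line(line):
    """Parse one pipe-delimited index row; return None for headers or junk."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != 5 or not parts[0].isdigit():
        return None  # header, separator, or malformed line
    cik, company, form, date_filed, filename = parts
    return {"cik": cik, "company": company, "form": form,
            "date": date_filed, "file": filename}


def form_frequencies(lines):
    """Count how many filings of each form type appear in the index."""
    return Counter(r["form"] for r in map(parse_index_line, lines) if r)


def daily_counts(lines, forms):
    """Count filings per filing date, restricted to the given form types."""
    wanted = set(forms)
    return Counter(
        r["date"] for r in map(parse_index_line, lines)
        if r and r["form"] in wanted
    )


# Made-up sample rows for illustration only.
sample_lines = [
    "CIK|Company Name|Form Type|Date Filed|Filename",
    "--------------------------------------------",
    "1|EXAMPLE CO A|4|2014-07-01|edgar/data/1/a.txt",
    "2|EXAMPLE CO B|4|2014-07-01|edgar/data/2/b.txt",
    "2|EXAMPLE CO B|3|2014-07-02|edgar/data/2/c.txt",
    "3|EXAMPLE CO C|SC 13D/A|2014-07-02|edgar/data/3/d.txt",
]
sample_freq = form_frequencies(sample_lines)
sample_daily = daily_counts(sample_lines, ["3", "4"])
```

`form_frequencies` gives the overall tally used to pick forms of interest, while `daily_counts` gives the per-day series that a frequency chart would plot.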

Figure 3.1.2.1: Frequency of Form 3 Filings, CQ3 2014

Picture2_sec

Figure 3.1.2.2: Frequency of Form 4 Filings, CQ3 2014

Picture1_sec

Figure 3.1.2.3: Frequency of Form 13D/A Filings, CQ3 2014

Picture3_sec

I ran a quick regression between the frequency of form filings (4, 3, SC 13D/A, 4/A, SC 13G/A, DEFA 14A, SC 13G, SC 13D) and the SP500 to see whether there was any general relationship between filing frequency and broad index performance.
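A regression of this kind can be sketched with a plain least-squares fit. The version below is only an illustration: it assumes daily filing counts are regressed against daily index returns (the exact variables used are not stated here), and the helper names are hypothetical.

```python
def daily_returns(levels):
    """Simple day-over-day returns from a list of index closing levels."""
    return [(b - a) / a for a, b in zip(levels, levels[1:])]


def ols(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared). Raises ValueError when the
    inputs differ in length, are too short, or x has no variance.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0:
        raise ValueError("x has no variance")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With no variance in y, the fit explains nothing; report 0.
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 0.0
    return slope, intercept, r_squared
```

An R-squared near zero from a fit like this would correspond to finding no broad relationship between filing frequency and index moves.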

Figure 3.1.2.4: SP500, 7/1/14 – 9/30/14

Picture4_sec

Figure 3.1.2.5: Regression Analysis, SP500 v. Form Filing Frequency

Picture5_sec

On a broad scale, the correlation is null. Moving to a more micro-level analysis, I set out to pick 10 companies to test my analysis on. I pulled each company’s market capitalization from FactSet to screen out inactive companies and clean my dataset. I reduced my dataset in the following order:

  • Began with 13,063 unique tickers
  • Excluded Companies with 0 market cap; remaining = 7,427 companies
  • Counted number of companies with market capitalization between $500MM and $2,000MM
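The screening steps above can be sketched as a simple filter. This is an illustrative version only: the function name and the ticker-to-market-cap mapping are hypothetical, market caps are assumed to be in $MM, and a missing value is treated the same as a zero market cap.

```python
def screen_by_market_cap(market_caps_mm, low=500.0, high=2000.0):
    """Split tickers into 'active' and 'in band' groups.

    market_caps_mm: dict of ticker -> market cap in $MM (0 or None if
    unavailable). Returns (active, in_band), where active drops
    zero/missing caps and in_band keeps caps within [low, high].
    """
    active = {
        t: cap for t, cap in market_caps_mm.items()
        if cap is not None and cap > 0
    }
    in_band = {t: cap for t, cap in active.items() if low <= cap <= high}
    return active, in_band


# Made-up tickers and values for illustration.
example_caps = {"AAA": 0, "BBB": 750.0, "CCC": 12000.0, "DDD": None, "EEE": 2000.0}
example_active, example_band = screen_by_market_cap(example_caps)
```

The band limits are parameters, so the same function covers a later shift of the target range (for example, up to $5,000MM) without changes.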

I then sought to define what constituted an “active” filer. I computed the average and median filing frequencies for companies at different market capitalization intervals.

Figure 3.1.2.6: Frequency of Filings at Different Market Capitalizations

Picture6_sec

I began to home in on my target company profile: a market cap somewhere between $500MM and $5,000MM. I also looked at the aggregate sample, measuring the first and third quartiles of filing counts for companies with market cap > $500MM, excluding companies that haven’t filed (active companies are required to file every quarter).

Figure 3.1.2.7: First and Third Quartile: Number of Filings with SEC for Companies with > $500MM Market Cap

Picture7_sec

At this point, I defined a frequent filer as one that posted more than 23 filings per quarter. I justified this because my sample population of companies has a median of fewer than 13 filings per quarter, and even the skewed averages fall just short of 20.
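The summary statistics and the frequent-filer cutoff can be sketched with Python's `statistics` module. The threshold of 23 comes from the text above; everything else (function names, sample counts, and the choice of the module's default quartile method) is an assumption for illustration.

```python
import statistics

# From the text: a frequent filer posts more than 23 filings per quarter.
FREQUENT_FILER_THRESHOLD = 23


def summarize_filings(counts_by_ticker):
    """Mean, median, and first/third quartiles of per-company filing counts.

    Companies with zero filings are excluded, mirroring the exclusion of
    non-filers. Needs at least two filers to compute quartiles.
    """
    counts = [c for c in counts_by_ticker.values() if c > 0]
    q1, _, q3 = statistics.quantiles(counts, n=4)  # default 'exclusive' method
    return {
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "q1": q1,
        "q3": q3,
    }


def frequent_filers(counts_by_ticker, threshold=FREQUENT_FILER_THRESHOLD):
    """Tickers whose filing count is strictly above the threshold."""
    return sorted(t for t, c in counts_by_ticker.items() if c > threshold)


# Made-up quarterly filing counts for illustration.
example_counts = {"AAA": 5, "BBB": 12, "CCC": 13, "DDD": 24, "EEE": 60, "FFF": 0}
example_summary = summarize_filings(example_counts)
example_frequent = frequent_filers(example_counts)
```

In the made-up sample, one heavy filer pulls the mean well above the median, which is the kind of skew that motivates setting the cutoff above both.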

The following are the companies I have chosen to analyze:

Figure 3.1.2.8: EDGAR Project Universe v1

Picture8_sec

Follow-Up

  • Develop scalable time series model to observe price reaction relative to Form 3 + 4 filings.
    • Graphical representation for 3Q 2014 (linear) & date of form filing (point)
      • Record top/bottom 3 dates for price reaction
      • Count if form filing occurs within 3-4 days of resultant reaction, simple percentage basis.
      • End goal is to parse occurrence within +/- 1, 2, 3, 4 days of filing over universe of 50 companies and record results.
      • If a relationship is established, structure data in terms of 1) classification between gains / losses; 2) magnitude of gains / losses for each period observed; 3) normalization for corresponding SP500 gains / losses to eliminate the market confound
  • Count only unique occurrences of filings and record titles of sellers (buyers)
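One possible shape for the event-window count in the follow-up plan is sketched below. It is a hypothetical starting point, not a finished model: the function names are made up, returns are assumed to be daily and keyed by date, the SP500 adjustment is a simple subtraction of the index return, and the window is measured in calendar days rather than trading days.

```python
from datetime import date


def excess_returns(stock, index):
    """Stock return minus index return on each date present in both series."""
    return {d: stock[d] - index[d] for d in stock if d in index}


def extreme_dates(returns, n=3):
    """Return (bottom_n, top_n) dates ranked by return."""
    ranked = sorted(returns, key=returns.get)
    return ranked[:n], ranked[-n:]


def share_near_filing(event_dates, filing_dates, window):
    """Fraction of event dates with a filing within +/- window calendar days."""
    if not event_dates:
        return 0.0
    unique_filings = set(filing_dates)  # count each filing date once
    hits = sum(
        1 for d in event_dates
        if any(abs((d - f).days) <= window for f in unique_filings)
    )
    return hits / len(event_dates)
```

Running `share_near_filing` for windows of 1, 2, 3, and 4 days across each company in the universe would give the simple percentage table the plan calls for.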