Alternative data is data published by sources outside of a company that offers unique and timely insights into investment opportunities. The types of alternative data include, but are not limited to, the following:
Social media posts
With the explosive growth of data, it becomes more and more challenging to handle alternative data with traditional tools such as Excel. That is where Big Data comes into play. Big Data refers to datasets that are too large, or too slow to process, for traditional data processing applications. For example, Carvana lists 50k+ cars, and collecting that listing data daily produces roughly 0.5 GB per month, 6 GB a year, and 30 GB in 5 years. If we deal with 100 sources of similar size, that comes to about 50 GB of data per month. Traditional tools struggle at that scale. In the next post, I will talk about how Saturn Data handles big data with our scalable and cost-effective infrastructure.
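The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `storage_gb` is made up for illustration, and it assumes every source produces the same 0.5 GB per month as the Carvana example.

```python
# Back-of-the-envelope storage estimate using the figures quoted above.
# Assumption: every source is the same size as the Carvana example.
GB_PER_SOURCE_PER_MONTH = 0.5


def storage_gb(sources: int, months: int) -> float:
    """Estimated raw data volume in GB for a number of sources over time."""
    return GB_PER_SOURCE_PER_MONTH * sources * months


print(storage_gb(1, 12))    # one source, one year      -> 6.0
print(storage_gb(1, 60))    # one source, five years    -> 30.0
print(storage_gb(100, 1))   # 100 sources, one month    -> 50.0
print(storage_gb(100, 60))  # 100 sources, five years   -> 3000.0
```

At 100 sources the five-year total is about 3 TB, which is well beyond what a spreadsheet can hold.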
Mobile App Scraping
One of the big data challenges is capturing the data. The traditional way is to scrape it from the web. Since 2017, mobile traffic has taken the lead in internet traffic (56% as of 2022), and companies have therefore put much more effort into mobile app development. A mobile app doesn’t function in the same way as a website. Relying on web scraping alone has several disadvantages:
Mobile apps often have more data than the web. For example, Grab’s website doesn’t show all the products for a specific store, while its mobile app does.
Some startups and small businesses don’t display their services or products on their websites, but they do in their mobile apps. For these businesses, web scraping is not an option.
The UI of a mobile app is usually neater and more organized than that of a website, which makes mobile app scraping more efficient.
Though mobile app scraping is the future of data scraping, it comes with challenges in terms of security and cost. For example, most apps require a login, which makes it harder to maintain user accounts and stay compliant.
Web and mobile scraping is legal if the data is collected from public websites and mobile apps. In April 2022, the Ninth Circuit reaffirmed that scraping data that is publicly accessible on the internet is not a violation of the Computer Fraud and Abuse Act (CFAA). This addresses common misconceptions such as "Web scraping is illegal" and "Web scraping operates in a grey area of the law". As long as scrapers are ethical, the data they collect is lawful and can be used in investing, marketing, economics, academic research, and even a personal price tracker for a product on your wish list.
To build an ethical mobile app scraper, we should follow a set of guidelines known as data compliance. Data compliance is a process that identifies the applicable governance for data collection, transformation, storage, delivery, and other activities, and ensures compliance with the terms and privacy policies of the crawled websites. Data compliance consists of these aspects:
Review the terms and conditions associated with the crawled websites
Restrict the traffic sent to crawled websites and reduce potential interference as much as possible
Do not scrape personal data or intellectual property
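As a rough illustration of how the last two guidelines might look in code, here is a minimal Python sketch. It is not Saturn Data's implementation. The class `PoliteFetchPolicy`, the helper `strip_personal_fields`, the field names in `PERSONAL_FIELDS`, the `example-bot` user agent, and the `example.com` URLs are all hypothetical. It also assumes the site publishes a robots.txt file, which is only one machine-readable signal and does not replace a review of the site's terms and conditions.

```python
import urllib.robotparser

# Hypothetical list of fields treated as personal data in this sketch.
PERSONAL_FIELDS = {"name", "email", "phone", "address"}


class PoliteFetchPolicy:
    """Decides whether and when a crawler may request a URL (illustrative)."""

    def __init__(self, robots_txt: str, user_agent: str, min_delay_s: float = 1.0):
        self.user_agent = user_agent
        self.min_delay_s = min_delay_s
        self._last_request = None
        # Parse robots.txt content that was fetched elsewhere.
        self._robots = urllib.robotparser.RobotFileParser()
        self._robots.parse(robots_txt.splitlines())

    def allowed(self, url: str) -> bool:
        """True if robots.txt permits this user agent to fetch the URL."""
        return self._robots.can_fetch(self.user_agent, url)

    def wait_time(self, now: float) -> float:
        """Seconds to wait before the next request to limit traffic."""
        if self._last_request is None:
            return 0.0
        return max(0.0, self.min_delay_s - (now - self._last_request))

    def record_request(self, now: float) -> None:
        """Remember when the most recent request was sent."""
        self._last_request = now


def strip_personal_fields(record: dict) -> dict:
    """Drop fields that look like personal data before storing a record."""
    return {k: v for k, v in record.items() if k not in PERSONAL_FIELDS}


robots_txt = "User-agent: *\nDisallow: /account/"
policy = PoliteFetchPolicy(robots_txt, user_agent="example-bot", min_delay_s=2.0)

print(policy.allowed("https://example.com/products/1"))        # True
print(policy.allowed("https://example.com/account/settings"))  # False
print(strip_personal_fields({"price": 10, "email": "a@b.c"}))  # {'price': 10}
```

In a real crawler, `wait_time(time.monotonic())` would be passed to `time.sleep` before each request, followed by `record_request(time.monotonic())` once the request is sent. Timestamps are passed in explicitly here so the policy can be checked without actually waiting.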
Alternative data is useful in many areas: it can strengthen your investment and business decisions and make your day-to-day life better. However, acquiring alternative data is challenging and requires sophisticated technology and careful data compliance.
Saturn Data makes mobile app scraping simple and accessible to everyone. The price starts from $9.99. Contact us today!