CloudWatch Events lets users create rules for EMR cluster events, including State Change and EMR Configuration Error. With the EMR State Change event, CloudWatch Events lets users publish notifications for the creation of EMR clusters, among other states…
When we try to implement CloudWatch notifications via SNS/email that are triggered when a specific EMR cluster has step state changes, the event pattern structure works when used with clusterId but fails when used with ClusterARN/cluster name. There…
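As a minimal sketch of the clusterId-based pattern described above, the snippet below builds an event pattern as a Python dictionary and serializes it to JSON. The helper name `emr_step_state_pattern`, the cluster ID `j-EXAMPLE12345`, and the chosen states are hypothetical placeholders, and the `detail-type` string is assumed to match the one EMR emits for step changes; check it against an actual event from your account before relying on it.

```python
import json


def emr_step_state_pattern(cluster_id, states=("FAILED",)):
    """Build an event pattern that matches step state changes for one cluster.

    Hypothetical helper for illustration; the pattern filters on the
    clusterId field inside the event's "detail" section.
    """
    return {
        "source": ["aws.emr"],
        # Assumed detail-type for EMR step events; verify against a real event.
        "detail-type": ["EMR Step Status Change"],
        "detail": {
            "clusterId": [cluster_id],
            "state": list(states),
        },
    }


# "j-EXAMPLE12345" is a placeholder cluster ID, not a real cluster.
pattern_json = json.dumps(
    emr_step_state_pattern("j-EXAMPLE12345", ["COMPLETED", "FAILED"]),
    indent=2,
)
print(pattern_json)
```

The resulting JSON string is the kind of value that could be supplied as the event pattern when creating a rule, for example in the console's pattern editor.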
The combination of Spark and Parquet is a very popular foundation for building scalable analytics platforms. In particular, performance, scalability, and ease of use are key elements of this solution that make it very appealing to its users. Predicate…
Amazon Athena added support for views with the release of a new version on June 5, 2018, allowing users to use commands like CREATE VIEW, DESCRIBE VIEW, DROP VIEW, SHOW CREATE VIEW, and SHOW VIEWS in Athena. The query that defines the view runs each…
Amazon Elastic MapReduce (EMR) is a web service that uses Hadoop to process vast amounts of data quickly and cost-effectively. It helps us analyze and process this data by distributing the computational work across a cluster of virtual…
Amazon EC2 is a web service that provides secure, resizable compute capacity in the cloud. It is a virtual machine deployed in the cloud that can be used for various computations. At its core, EC2 is a Linux/Windows instance in the cloud and…
Hive is an open-source data warehouse and analytics package that runs on top of a Hadoop cluster. Hive scripts use a SQL-like language called HiveQL (Hive Query Language) that abstracts programming models and supports typical data warehouse…
I am a Software Development Engineer currently working at Amazon. Before this, I worked at Amazon Web Services as a Big Data Cloud Support Associate, after completing my bachelor's degree (B.Tech. in Computer Science) in 2018. I have experience in AWS…
Yaydoc, our automatic documentation generation and deployment project, generates and deploys documentation for each of its registered repositories. These repositories registered to Yaydoc have various configurable settings which can be edited to…
Yaydoc, our automatic documentation generation and deployment project, generates and deploys documentation for each of its registered repositories. For every commit made to a registered repository, there is a corresponding build process running at…