Software Engineer | Technical Evangelist

Blog PostsResume


Partitioning is an important technique for organizing datasets so they can be queried efficiently. It organizes data in a hierarchical directory structure based on the distinct values of one or more columns. By default, a DynamicFrame is not partit...



Amazon SageMaker notebook instance is a managed ML compute instance that runs the Jupyter Notebook Application. The Jupyter notebook enables you to fetch raw files and download them, and even exposes a *download* button. Due to security and compliant...



SageMaker Notebook Instances do not publish any metrics to CloudWatch unlike other SageMaker components like Endpoints. This prevents us from observing any metrics and in turn creating alarms on those metrics. However, considering the fact that we h...


Glue is an Amazon provided and managed ETL platform that uses the open source Apache Spark behind the back. When you write a DynamicFrame ton S3 using the `write_dynamic_frame()` method, it will internally call the Spark methods to save the file. Sin...




Docker allows us to package and run applications as an isolated process on a shared operating system, acting as a lighter weight alternative to virtual machines. The reason I chose to _dockerise_ my blog was the same as everybody else, 'speeding up t...


© 2019 | Ujjwal Bhardwaj. All Rights Reserved. (Source code)