Introduction What do data engineers do? Data engineers: Data management challenges: Essentially, the role of the data engineer is to get data from sources, make it useful, and serve the data to stakeholders. Data Discovery Data engineer responsibilities: AWS Data Services and Modern Infrastructure Basic workflow of modern data engineering building blocks: Orchestration and Automation Options […]
Tag: AWS
My Notes – Machine Learning
Algorithms Measures Low variance vs high variance (high variance is good for the model). Dimensionality Reduction Hyperparameter Tuning AWS Services
AWS Certified Developer – Associate (DVA-C02)
My reference page for notes produced when studying AWS material. Well-Architected Framework Let’s start with the AWS Well-Architected Framework, because in AWS everything touches that concept. The six pillars of the Well-Architected Framework include: Cloud Design Patterns Security IAM – concept of users and policies. Policies are assigned to users. Can use existing policies or create new […]
SSL Offloading – AWS Application Load Balancer
Did you know that the AWS Application Load Balancer (ALB) supports SSL offloading out of the box? To start, create an Amazon-issued certificate in AWS Certificate Manager. Then assign this certificate to the ALB's HTTPS:443 listener and route the traffic to a target group that listens on HTTP:80. One great benefit of such […]
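The listener and target group setup described above can be sketched as plain request parameters. This is a minimal sketch, not the post's own method: the helper names `target_group_kwargs` and `listener_kwargs` are hypothetical, and the ARNs, VPC ID, and group name are placeholders you would supply. The dictionary keys follow the boto3 `elbv2` `create_target_group` and `create_listener` parameters.

```python
from typing import Any, Dict


def target_group_kwargs(name: str, vpc_id: str) -> Dict[str, Any]:
    """Parameters for a target group that receives plain HTTP on port 80."""
    return {
        "Name": name,          # hypothetical name chosen by the caller
        "Protocol": "HTTP",    # traffic behind the ALB is unencrypted
        "Port": 80,
        "VpcId": vpc_id,
    }


def listener_kwargs(
    load_balancer_arn: str, certificate_arn: str, target_group_arn: str
) -> Dict[str, Any]:
    """Parameters for an HTTPS:443 listener that terminates TLS at the ALB."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "HTTPS",
        "Port": 443,
        # Certificate issued by AWS Certificate Manager.
        "Certificates": [{"CertificateArn": certificate_arn}],
        # Forward decrypted requests to the HTTP:80 target group.
        "DefaultActions": [
            {"Type": "forward", "TargetGroupArn": target_group_arn}
        ],
    }


# Usage with boto3 (assumes credentials and an existing ALB; not run here):
# import boto3
# elbv2 = boto3.client("elbv2")
# tg = elbv2.create_target_group(**target_group_kwargs("web-http", vpc_id))
# tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]
# elbv2.create_listener(**listener_kwargs(lb_arn, cert_arn, tg_arn))
```

Keeping the parameters in small functions makes the HTTPS-in, HTTP-out split easy to review before any AWS call is made.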
Data Science – Machine Learning
Following are my personal notes that prepared me to take the AWS Certified Machine Learning – Specialty (MLS-C01) exam. Data Processing Encode Values Encoding categorical variables: (pandas supports dtype="category") Encoding nominal variables: Handling Missing Values Missing values are very common. Machine learning algorithms can’t handle missing values automatically. Data with too many missing values can’t […]
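As an illustration of the encoding step, here is a minimal sketch in plain Python. It assumes the common convention, not confirmed by the truncated excerpt, that ordered categories map to integer codes and nominal categories map to one-hot columns. The helper names `ordinal_encode` and `one_hot_encode` and the sample values are hypothetical. Missing values get the code -1, which mirrors how pandas reports missing entries in `.cat.codes` for a `dtype="category"` column.

```python
from typing import Dict, List, Optional, Sequence


def ordinal_encode(
    values: Sequence[Optional[str]], order: Sequence[str]
) -> List[int]:
    """Map ordered categories to integer codes; missing or unknown -> -1."""
    codes = {category: i for i, category in enumerate(order)}
    return [codes.get(v, -1) for v in values]


def one_hot_encode(values: Sequence[Optional[str]]) -> Dict[str, List[int]]:
    """Build one 0/1 column per nominal category; missing rows are all zeros."""
    categories = sorted({v for v in values if v is not None})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


sizes = ["S", "L", None, "M"]             # ordered categories
colors = ["red", "blue", None, "red"]     # nominal categories, no natural order

print(ordinal_encode(sizes, ["S", "M", "L"]))  # [0, 2, -1, 1]
print(one_hot_encode(colors))  # {'blue': [0, 1, 0, 0], 'red': [1, 0, 0, 1]}
```

One-hot columns avoid implying an order between nominal values, at the cost of one extra column per category.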
Essentials – Machine Learning
Following are my personal notes that prepared me to take the AWS Certified Machine Learning – Specialty (MLS-C01) exam. Terminology and Process Effective AI Strategy The flywheel of data (positive feedback loop): more data means better analytics -> better analytics results in better products -> better products means more users -> that generates more data. […]