Learn about Python Generators

If, like me, you have a few months of Python experience, you may be in the same position I am: you know something about Python, but not enough. So I spent some time gathering scattered bits of knowledge to dig deeper into Python. In this post, the topic is generators.

But first, remember the iterator we covered earlier? Generators are closely linked to the iterator concept:

generator

A function which returns a generator iterator. It looks like a normal function except that it contains yield expressions for producing a series of values usable in…
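To make the definition concrete, here is a minimal sketch of a generator function. The name `countdown` is made up for illustration; the point is only that calling the function returns a generator iterator, and the body runs lazily from one `yield` to the next.

```python
def countdown(start):
    """Generator function: yields start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


gen = countdown(3)          # calling the function returns a generator iterator
print(iter(gen) is gen)     # True: a generator iterator is its own iterator
print(next(gen))            # 3: runs the body up to the first yield
print(list(gen))            # [2, 1]: consumes the remaining values
```

Nothing in the function body runs until the first `next()` call, which is what lets a generator produce a long series of values without building the whole list in memory.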


Chapter 6. Partitioning

This is part of a series of learning notes on Designing Data-Intensive Applications.

For very large datasets or very high query throughput, replication alone is not sufficient: we need to break the data up into partitions (also known as sharding).

Scalability is the main reason for partitioning data. It enables a large dataset to be distributed across many disks, and a query load can be distributed across many processors.

Copies of each partition are usually stored on multiple nodes. Each record belongs to exactly one partition, but it may still be stored on several nodes for fault tolerance. A node may store more than one partition.
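The relationship between records, partitions, and nodes can be sketched in a few lines. This is only an illustration under assumptions of my own: the node names, the partition count, the replication factor, and the choice of hash-modulo placement are all hypothetical, and real systems use more careful schemes.

```python
import hashlib

# All of these values are assumptions made for the example.
NUM_PARTITIONS = 8
NODES = ["node-a", "node-b", "node-c", "node-d"]
REPLICATION_FACTOR = 2


def partition_for(key):
    """Map a record key to exactly one partition using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def nodes_for(partition):
    """Place each partition on REPLICATION_FACTOR consecutive nodes."""
    return [NODES[(partition + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]


p = partition_for("user:42")
print(p, nodes_for(p))
```

With these numbers, every record lands in one partition, every partition lives on two nodes, and every node holds several partitions.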


Chapter 5. Replication

This is part of a series of learning notes on Designing Data-Intensive Applications.

Reasons for replication:

  • Store data geographically close to your users
  • Improve availability
  • Improve throughput

The challenge is not storing the data, but handling changes to replicated data. Three popular algorithms for replicating changes between nodes are single-leader, multi-leader, and leaderless replication.

Leaders and followers

Each node that stores a copy of the database is called a replica.

  1. One of the replicas is designated the leader. Clients that want to write to the database must send their requests to the leader.
  2. Other replicas are known as followers. The leader sends the data change to all of its followers as part…
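A toy model may help picture the leader and follower roles. The class names are hypothetical, and the leader here forwards each change synchronously and in-process, which is a simplifying assumption rather than how a real database ships changes.

```python
class Replica:
    """A node holding a full copy of the data (hypothetical toy model)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Leader(Replica):
    """The replica that accepts writes and forwards them to followers."""

    def __init__(self, name, followers):
        super().__init__(name)
        self.followers = followers

    def write(self, key, value):
        self.apply(key, value)                # write locally first
        for follower in self.followers:       # then send the change onward
            follower.apply(key, value)        # assumption: synchronous delivery


followers = [Replica("follower-1"), Replica("follower-2")]
leader = Leader("leader", followers)
leader.write("user:1", "Alice")
print([f.data for f in followers])
```

Reads could be served by any replica in this model, but writes only enter through `Leader.write`.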

Chapter 4. Encoding and Evolution

This is part of a series of learning notes on Designing Data-Intensive Applications.

Evolvability is important for maintainability in the long term, as applications inevitably change over time.

Rolling upgrade: a few nodes are updated at a time, so no service downtime is required. During a rolling upgrade, old and new versions of the code and data formats may coexist in the system at the same time.

  • Backward compatibility: newer code can read data written by older code.
  • Forward compatibility: older code can read data written by newer code.
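The two directions of compatibility can be sketched with a JSON record that gains a field between versions. The `email` field and the function names are invented for the example; the only claim is the general pattern of supplying defaults for missing fields and ignoring unknown ones.

```python
import json


def encode_v1(name):
    """Old writer: knows only the name field."""
    return json.dumps({"name": name})


def encode_v2(name, email):
    """New writer: adds a hypothetical email field."""
    return json.dumps({"name": name, "email": email})


def decode_v1(payload):
    """Old reader: picks out the fields it knows and ignores the rest."""
    record = json.loads(payload)
    return {"name": record["name"]}


def decode_v2(payload):
    """New reader: falls back to a default when the new field is missing."""
    record = json.loads(payload)
    return {"name": record["name"], "email": record.get("email")}


# Backward compatibility: newer code reads data written by older code.
print(decode_v2(encode_v1("Ada")))
# Forward compatibility: older code reads data written by newer code.
print(decode_v1(encode_v2("Ada", "ada@example.com")))
```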

Formats for encoding data

Two different representations:

  • In memory: data is kept in objects, structs, lists, arrays, hash tables, trees, etc. …

These are my notes from an online operating systems course I took on educative.io.

Log-structured file system

LFS is short for Log-structured File System. When writing to disk, LFS first buffers all updates in an in-memory segment. When the segment is full, it is written to disk in one long, sequential transfer to an unused part of the disk. LFS never overwrites existing data, but rather always writes segments to free locations. Because segments are large, the disk (or RAID) is used efficiently, and the performance of the file system approaches its zenith. Later, LFS then reclaims that old…
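The buffering behaviour described above can be modelled with a small in-memory sketch. The class name, the tiny segment size, and the use of a Python list to stand in for the disk are all assumptions for illustration; cleaning of old data is left out.

```python
class LogStructuredStore:
    """Toy model: buffer updates in a segment, then append it to the 'disk'."""

    def __init__(self, segment_size):
        self.segment_size = segment_size
        self.segment = []   # in-memory segment buffer
        self.disk = []      # append-only list of written segments

    def write(self, block_id, data):
        self.segment.append((block_id, data))
        if len(self.segment) >= self.segment_size:
            self.flush()

    def flush(self):
        """Write the whole buffered segment in one sequential append."""
        if self.segment:
            self.disk.append(tuple(self.segment))
            self.segment = []


store = LogStructuredStore(segment_size=2)
store.write("a", b"v1")
store.write("b", b"v1")   # segment is full, so it is flushed
store.write("a", b"v2")   # a newer version of block "a" goes to a new segment
store.flush()
print(store.disk)
```

After this run the old version of block "a" still sits in the first segment, which is the stale data a log-structured design later has to reclaim.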


These are my notes from an online operating systems course I took on educative.io.

Chapter 5. Introduction to FSCK and Journaling

Crash Consistency

One major challenge faced by a file system is how to update persistent data structures correctly even if the system crashes partway through.

Crash scenario

Appending to a file updates three on-disk structures: the inode (which must point to the new block and record the larger size resulting from the append), the new data block, and a new version of the data bitmap.

If only a single write succeeds, there are three possible outcomes:

  • Just the data block is written to disk. In…
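One way to reason about these cases is to enumerate which of the three writes reached the disk before the crash. The sketch below is a simplified classification of my own, not the course's wording: it treats the metadata as consistent only when the inode and the bitmap agree.

```python
from itertools import combinations

WRITES = ("inode", "bitmap", "data")


def classify(completed):
    """Classify the on-disk state given the writes that finished before a crash."""
    done = set(completed)
    inode, bitmap, data = (w in done for w in WRITES)
    if inode != bitmap:
        return "inconsistent: inode and bitmap disagree"
    if inode and not data:
        return "consistent metadata, but the inode points at garbage"
    if data and not inode:
        return "consistent: the data block is unreachable, as if the append never happened"
    return "consistent"


# The three cases where exactly one write succeeds.
for (write,) in combinations(WRITES, 1):
    print(write, "->", classify((write,)))
```

Passing pairs from `combinations(WRITES, 2)` gives the cases where two of the three writes succeed.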


These are my notes from an online operating systems course I took on educative.io.

Chapter 4. Fast File System

The old UNIX file system design led to fragmented file allocation and poor performance because free space was not carefully managed. For example, a logically contiguous file would be accessed by going back and forth across the disk, which reduced performance dramatically.

  • With a random access memory pattern, the free list becomes gradually more dispersed
  • Also, the smaller the block size, the larger the positioning overhead relative to the data transferred per block
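The block-size point can be checked with a little arithmetic. The 10 ms positioning time and 100 MB/s transfer rate below are assumed round numbers, not figures from the course; the function simply divides the bytes moved by the total time spent positioning and transferring.

```python
def effective_bandwidth(block_bytes, positioning_s, transfer_bytes_per_s):
    """Bytes per second actually achieved when each block needs one positioning step."""
    transfer_s = block_bytes / transfer_bytes_per_s
    return block_bytes / (positioning_s + transfer_s)


POSITIONING_S = 0.010          # assumed seek plus rotation time
TRANSFER_RATE = 100 * 10**6    # assumed peak transfer rate in bytes per second

for block_bytes in (512, 4096, 1024 * 1024):
    bandwidth = effective_bandwidth(block_bytes, POSITIONING_S, TRANSFER_RATE)
    print(f"{block_bytes:>8} bytes per block: {bandwidth / 10**6:.2f} MB/s")
```

Under these assumptions the small blocks spend almost all of their time positioning, so larger blocks get much closer to the peak rate.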

You can see what happens is that E gets…


These are my notes from an online operating systems course I took on educative.io.

Chapter 3. File and Directory

A persistent-storage device, such as a classic hard disk drive or a more modern solid-state storage device, stores information permanently. Unlike memory, whose contents are lost when there is a power loss, a persistent-storage device keeps such data intact.

Two key abstractions have developed over time in the virtualisation of storage.

  • File: A file is simply a linear array of bytes, each of which you can read or write.
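The "linear array of bytes" view is easy to try out. The helper below is a hypothetical example that writes a file into a temporary directory, seeks to a byte offset, overwrites one byte, and reads the whole array back.

```python
import os
import tempfile


def overwrite_byte(initial, offset, new_byte):
    """Write `initial` to a temp file, overwrite one byte at `offset`, return the contents."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        with open(path, "wb") as f:
            f.write(initial)
        with open(path, "r+b") as f:   # open for reading and writing, no truncation
            f.seek(offset)             # jump to a position in the byte array
            f.write(new_byte)          # overwrite the byte at that position
            f.seek(0)
            return f.read()


print(overwrite_byte(b"hello world", 6, b"W"))  # b'hello World'
```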


These are my notes from an online operating systems course I took on educative.io.

Chapter 2. Hard Disk Drives

The drive consists of a large number of sectors (512-byte blocks), each of which can be read or written. The sectors are numbered from 0 to n-1 on a disk with n sectors (0 to n-1 is the address space of the drive). Many file systems will read or write 4KB at a time. However, when updating the disk, only a single 512-byte write is atomic (i.e., it will either complete in its entirety or won’t complete at all).
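A short sketch shows why a 4KB write is not covered by the 512-byte atomicity guarantee: it spans several sectors. The function name is made up for the example, and it assumes byte offsets map to sectors by simple integer division.

```python
SECTOR_SIZE = 512  # bytes per sector, as in the text


def sectors_for(offset, length):
    """Return the range of sector numbers touched by a write of `length` bytes at `offset`."""
    first = offset // SECTOR_SIZE
    last = (offset + length - 1) // SECTOR_SIZE
    return range(first, last + 1)


print(list(sectors_for(0, 4096)))   # [0, 1, 2, 3, 4, 5, 6, 7]
print(list(sectors_for(512, 512)))  # [1]
```

An aligned 4KB write touches eight sectors, so a crash partway through can leave some of them updated and others not.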


These are my notes from an online operating systems course I took on educative.io.

Chapter 1. I/O Devices

To connect I/O devices to the processor and memory, we use a “bus”. A bus is a shared communication link that uses one set of wires to connect multiple subsystems. Sometimes the bus is shared with memory, and sometimes there is a separate I/O bus.

E.Y.
