![]() For the dual-core process (right), each node takes on 5 tasks, thereby doubling the processing speed. It slices your DataFrame into different parts such that each part can be sent to a different CPU core. ![]() Modin partitions the DataFrames across both the rows and the columns. This makes Modin’s parallel processing scalable to DataFrames of any shape. Imagine if you are given a DataFrame with many columns but fewer rows. Some libraries only perform the partitioning across rows, which would be inefficient in this case since we have more columns than rows. But with Modin, since the partitioning is done across both dimensions, the parallel processing remains efficient all shapes of DataFrames, whether they are wider (lots of columns), longer (lots of rows), or both.Ī Pandas DataFrame (left) is stored as one block and is only sent to one CPU core. A Modin DataFrame (right) is partitioned across rows and columns, and each partition can be sent to a different CPU core up to the max cores in the system. Modin actually uses a Partition Manager that can change the size and shape of the partitions based on the type of operation.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |