About 2,200,000 results
Open links in new tab
  1. python - Why does Dask perform so slower while multiprocessing …

    Sep 6, 2019 · 36 dask delayed 10.288054704666138s my cpu has 6 physical cores Question Why does Dask perform so slower while multiprocessing perform so much faster? Am I using Dask the wrong …

  2. dask: difference between client.persist and client.compute

    Jan 23, 2017 · More pragmatically, I recommend using persist when your result is large and needs to be spread among many computers and using compute when your result is small and you want it on just …

  3. python - Why does dask take long time to compute regardless of the …

    Mar 24, 2022 · The reason dask dataframe is taking more time to compute (shape or any operation) is because when a compute op is called, dask tries to perform operations from the creation of the …

  4. Reading an SQL query into a Dask DataFrame - Stack Overflow

    May 24, 2022 · I'm trying create a function that takes an SQL SELECT query as a parameter and use dask to read its results into a dask DataFrame using the dask.read_sql_query function.

  5. How to see progress of Dask compute task? - Stack Overflow

    I would like to see a progress bar on Jupyter notebook while I'm running a compute task using Dask, I'm counting all values of id column from a large csv file +4GB, so any ideas? import dask.datafr...

  6. Strategy for partitioning dask dataframes efficiently

    Jun 20, 2017 · The documentation for Dask talks about repartioning to reduce overhead here. They however seem to indicate you need some knowledge of what your dataframe will look like …

  7. python - Difference between dask.distributed LocalCluster with threads ...

    Sep 2, 2019 · What is the difference between the following LocalCluster configurations for dask.distributed? Client(n_workers=4, processes=False, threads_per_worker=1) versus …

  8. python - dask: What does memory_limit control? - Stack Overflow

    Oct 4, 2021 · The link you posted says explicitly that it's a per worker limit $ dask-worker tcp://scheduler:port --memory-limit="4 GiB" # four gigabytes per worker process. And you get the …

  9. dask - Make Pandas DataFrame apply () use all cores? - Stack Overflow

    As of August 2017, Pandas DataFame.apply() is unfortunately still limited to working with a single core, meaning that a multi-core machine will waste the majority of its compute-time when you run df.

  10. How to transform Dask.DataFrame to pd.DataFrame?

    Aug 18, 2016 · How can I transform my resulting dask.DataFrame into pandas.DataFrame (let's say I am done with heavy lifting, and just want to apply sklearn to my aggregate result)?