Search: [ProfilingPerfsComplexity] - Liens utiles et à partager

New CVM algorithm - Counting Distinct Elements in Streams: An Algorithm for the (Text) Book - arXiv

A new count-distinct algorithm:

We present a simple, intuitive, sampling-based space-efficient algorithm whose description and the proof are accessible to undergraduates with the knowledge of basic probability theory.

Donald Knuth likes it: https://www-cs-faculty.stanford.edu/~knuth/papers/cvm-note.pdf

Their algorithm is not only interesting, it is extremely simple.
Furthermore, it’s wonderfully suited to teaching students who are learning the basics of computer science.
I’m pretty sure that something like this will eventually become a standard textbook topic.

Here is the CWEB implementation he produced: cvm-estimates.w (archive.org)
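For readers who prefer Python to CWEB, here is a minimal sketch of the sampling idea as described in the arXiv paper. The names `cvm_estimate`, `buffer_size` and `rng` are my own assumptions, not taken from the paper or from Knuth's program, and the buffer size is passed in directly rather than derived from accuracy parameters as the paper does.

```python
import random


def cvm_estimate(stream, buffer_size, rng=None):
    """Estimate the number of distinct items in `stream`.

    Sketch of the CVM sampling scheme: keep each item in a bounded
    buffer with probability p, and halve p whenever the buffer fills.
    """
    rng = rng or random.Random()
    p = 1.0
    buffer = set()
    for item in stream:
        # Forget any earlier decision about this item, then re-sample it.
        buffer.discard(item)
        if rng.random() < p:
            buffer.add(item)
        if len(buffer) == buffer_size:
            # Buffer full: drop each kept item with probability 1/2.
            buffer = {x for x in buffer if rng.random() < 0.5}
            p /= 2
            if len(buffer) == buffer_size:
                # Rare failure case: nothing was evicted.
                raise RuntimeError("buffer still full after halving")
    return len(buffer) / p


if __name__ == "__main__":
    words = "the quick brown fox jumps over the lazy dog the end".split()
    print(cvm_estimate(words, buffer_size=100, rng=random.Random(0)))
```

When the buffer is larger than the number of distinct items, p never drops below 1 and the result is exact; smaller buffers trade accuracy for memory.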

Source: https://jmason.ie/2024/05/21/165901a.html

Interesting HackerNews comments: https://news.ycombinator.com/item?id=40379175

arXiv · algorithm · stream · Searching_&_Sorting · maths · ProfilingPerfsComplexity · probability · HackerNews

September 3, 2024 03:07:38 PM GMT+02:00 * · permalink

·

https://arxiv.org/abs/2301.10191

How does Shazam work - Coding Geek

Have you ever wondered how Shazam works? I asked myself this question a few years ago and I read a research article written by Avery Li-Chun Wang, the co-founder of Shazam, to understand the magic behind Shazam. The quick answer is audio fingerprinting, which leads to another question: what is audio fingerprinting?
This article is a summary of the search I did to understand Shazam.
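To give a rough idea of what audio fingerprinting can look like, here is a toy sketch of the peak-pairing approach as I understand it from Wang's paper: pairs of spectrogram peaks are hashed as (frequency 1, frequency 2, time gap), and a match is a time offset on which many hashes agree. The function names, the `fan_out` parameter and the hand-written peak lists are assumptions for illustration; extracting real peaks from audio is left out entirely.

```python
from collections import Counter, defaultdict


def fingerprint(peaks, fan_out=2):
    """Turn time-sorted (time, freq) peaks into (hash, anchor_time) pairs."""
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        # Pair each anchor peak with the next few peaks after it.
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            hashes.append(((f1, f2, t2 - t1), t1))
    return hashes


def best_offset(song_hashes, sample_hashes):
    """Return (offset, votes) for the most common time offset, or None."""
    index = defaultdict(list)
    for h, t in song_hashes:
        index[h].append(t)
    votes = Counter()
    for h, t in sample_hashes:
        for t_song in index.get(h, []):
            votes[t_song - t] += 1
    return votes.most_common(1)[0] if votes else None


if __name__ == "__main__":
    song = [(0, 10), (1, 20), (2, 15), (3, 30), (4, 25), (5, 40)]
    # The same peaks as the song from t=2 onward, re-timed to start at 0.
    sample = [(0, 15), (1, 30), (2, 25), (3, 40)]
    print(best_offset(fingerprint(song), fingerprint(sample)))
```

Hashing pairs rather than single peaks makes each key far more specific, so a short, noisy sample can still vote overwhelmingly for one offset.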

Source: https://jmason.ie/2024/07/09/205901a.html

archive.org · shazam · algorithm · ProfilingPerfsComplexity · music · Fingerprinting · Searching_&_Sorting

September 3, 2024 02:47:32 PM GMT+02:00 * · permalink

·

https://web.archive.org/web/20150713235721/http://coding-geek.com/how-shazam-works/

Introduction — pymemtrace 0.1.1 documentation

This describes tools and techniques that can identify memory leaks in long-running Python programs:

Is it a Leak?

Sources of Leaks

A Bit About (C)Python Memory Management

Reference Counts

Garbage Collection

The Big Picture

CPython’s Object Allocator (pymalloc)

Here is a visualisation of memory allocators from top to bottom (from the Python source Objects/obmalloc.c):

    _____   ______   ______       ________
   [ int ] [ dict ] [ list ] ... [ string ]       Python core         |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
    _______________________________       |                           |
   [   Python's object allocator   ]      |                           |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
    ______________________________________________________________    |
   [          Python's raw memory allocator (PyMem_ API)          ]   |
+1 | <----- Python memory (under PyMem manager's control) ------> |   |
    __________________________________________________________________
   [    Underlying general-purpose allocator (ex: C library malloc)   ]
 0 | <------ Virtual memory allocated for the python process -------> |

   =========================================================================
    _______________________________________________________________________
   [                OS-specific Virtual Memory Manager (VMM)               ]
-1 | <--- Kernel dynamic storage allocation & management (page-based) ---> |
    __________________________________   __________________________________
   [                                  ] [                                  ]
-2 | <-- Physical memory: ROM/RAM --> | | <-- Secondary storage (swap) --> |
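As a small companion to the "Is it a Leak?" question, here is a sketch that uses the standard library's `tracemalloc` (not pymemtrace itself) to compare two snapshots and see which lines allocated memory that was never released. The `leaky_cache` list and the `handle_request` function are hypothetical stand-ins for a real leak.

```python
import tracemalloc

leaky_cache = []  # hypothetical leak: grows forever, never cleared


def handle_request(i):
    # Each call keeps a 10 kB buffer alive by storing it globally.
    leaky_cache.append(bytearray(10_000))


def measure_growth(n_calls):
    """Return (net bytes allocated, top 3 growth sites) over n_calls."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for i in range(n_calls):
        handle_request(i)
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Group allocations by source line and diff the two snapshots.
    stats = after.compare_to(before, "lineno")
    return sum(s.size_diff for s in stats), stats[:3]


if __name__ == "__main__":
    growth, top = measure_growth(100)
    print(f"net growth: {growth} bytes")
    for stat in top:
        print(stat)
```

`tracemalloc` only sees allocations made through Python's allocators (layers +1 to +3 in the diagram above); memory taken directly with the C library `malloc` by an extension would need a different tool.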

Python · CPython · memory · MemoryAllocation · garbage-collection · leak · ProfilingPerfsComplexity · performances · C_Profiling · Python_internals:_breaking/debugging/sandboxing · C_&_C++

March 13, 2023 01:13:07 PM GMT+01:00 * · permalink

·

https://pymemtrace.readthedocs.io/en/latest/memory_leaks/introduction.html

Walmart Node.js Memory Leak | Joyent

He found that the more HTTP client requests he did, the more memory his Node process would consume, but the growth was really slow.

[...] Then I ran Node with UMEM_DEBUG set to record various important information about the memory allocations

[...] Every hour, it grabbed the output of pmap -x and a core file and stored those in Joyent Manta

[...] In MDB there's a particularly helpful command ::findleaks that will show you the memory addresses and the stack traces for leaked memory, not unlike using valgrind, but without all the performance penalty.

[...] At this point we knew that we were looking for something in v0.10 that called MakeCallback but that didn't first have a HandleScope on the stack. I then worked up this simple DTrace script.

Perfs/Profiling/Debug · ProfilingPerfsComplexity · performances · JS_Perfs · NodeJS · node-js · C_Profiling · pmap · leak · memory · MemoryAllocation · valgrind · DTrace

February 1, 2023 11:39:43 AM GMT+01:00 * · permalink

·

https://tritondatacenter.com/blog/walmart-node-js-memory-leak

How fast are Linux pipes anyway?

In this post, we will explore how Unix pipes are implemented in Linux by iteratively optimizing a test program that writes and reads data through a pipe.

We will begin with a simple program with a throughput of around 3.5GiB/s, and improve its performance twentyfold.
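To get a feel for this kind of measurement, here is a rough Python sketch that pushes bytes through a pipe and reports throughput. It is not the article's test program (which is written in C and will be much faster); the function name `pipe_throughput` and its parameters are assumptions for illustration.

```python
import os
import threading
import time


def pipe_throughput(total_bytes=1 << 24, chunk=1 << 16):
    """Write total_bytes through an os.pipe(); return (received, bytes/s)."""
    read_fd, write_fd = os.pipe()
    payload = memoryview(bytes(chunk))

    def writer():
        sent = 0
        try:
            while sent < total_bytes:
                n = min(chunk, total_bytes - sent)
                # os.write may write fewer bytes than requested.
                sent += os.write(write_fd, payload[:n])
        finally:
            os.close(write_fd)  # signals end-of-file to the reader

    thread = threading.Thread(target=writer)
    start = time.perf_counter()
    thread.start()
    received = 0
    while True:
        data = os.read(read_fd, chunk)
        if not data:
            break
        received += len(data)
    elapsed = max(time.perf_counter() - start, 1e-9)
    thread.join()
    os.close(read_fd)
    return received, received / elapsed


if __name__ == "__main__":
    received, rate = pipe_throughput()
    print(f"{received} bytes at {rate / 2**30:.2f} GiB/s")
```

Numbers from this sketch include Python interpreter overhead, so they are only useful for comparing chunk sizes against each other, not against the figures in the article.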

C_&_C++ · Perfs/Profiling/Debug · ProfilingPerfsComplexity · optimization · throughput · Linux

June 8, 2022 11:37:10 PM GMT+02:00 * · permalink

·

https://mazzo.li/posts/fast-pipes.html

USE Method: Linux Performance Checklist

The USE Method provides a strategy for performing a complete check of system health, identifying common bottlenecks and errors. For each system resource, metrics for utilization, saturation and errors are identified and checked. Any issues discovered are then investigated using further strategies.
This is an example USE-based metric list for Linux operating systems (eg, Ubuntu, CentOS, Fedora). This is primarily intended for system administrators of the physical systems, who are using command line tools. Some of these metrics can be found in remote monitoring tools.

Perfs/Profiling/Debug · ProfilingPerfsComplexity · Prog · Linux_SysAdmin · Linux

January 8, 2018 03:43:44 PM GMT+01:00 · permalink

·

http://www.brendangregg.com/USEmethod/use-linux.html

Notes on CPython Lists

TL;DR:

List resizing: the backing array is grown by approximately 12% (in Java, ArrayList grows by 50% when expanded, and in Ruby, Array grows by 100%). The Python implementation optimizes for memory usage over speed. Another reason to preallocate Python lists when possible.
Inserting at the beginning of a list takes linear time. When that matters, consider a deque, which offers constant-time insert and remove at both ends in exchange for giving up constant-time indexing.
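Both points can be observed from Python itself. The sketch below is my own illustration, not code from the article: `capacity_steps` watches `sys.getsizeof` to find the lengths at which the backing array is reallocated, and the two `prepend_*` helpers (hypothetical names) build the same sequence with `list.insert(0, ...)` and `deque.appendleft`.

```python
import sys
from collections import deque


def capacity_steps(n):
    """Return the list lengths at which the allocated size changes."""
    lst = []
    last_size = sys.getsizeof(lst)
    steps = []
    for _ in range(n):
        lst.append(None)
        size = sys.getsizeof(lst)
        if size != last_size:
            # The backing array was reallocated on this append.
            steps.append(len(lst))
            last_size = size
    return steps


def prepend_list(items):
    out = []
    for x in items:
        out.insert(0, x)  # shifts every existing element: linear time
    return out


def prepend_deque(items):
    out = deque()
    for x in items:
        out.appendleft(x)  # constant time at either end
    return list(out)


if __name__ == "__main__":
    print(capacity_steps(100))
    print(prepend_list(range(5)), prepend_deque(range(5)))
```

The exact resize points depend on the CPython version, which is why the sketch prints them rather than hard-coding any expected values.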

By Russell Cohen

Python · CPython · ProfilingPerfsComplexity · DataStructures · ProgLanguages · Prog · performances

January 8, 2018 03:17:59 PM GMT+01:00 * · permalink

·

https://rcoh.svbtle.com/notes-on-cpython-lists

Profiling Python in Production

Nylas provides a modern developer platform for email, contacts, and calendar. Stop fighting old protocols, and start building great products.

ProgLanguages · Python · ProfilingPerfsComplexity · performances

October 7, 2015 10:18:44 PM GMT+02:00 * · permalink

·

https://nylas.com/blog/performance

Comparing the speed of CPython, Brython, Skulpt and pypy.js | brythonista

Disclaimer: I am the creator and main developer of Brython. I am aware that this makes me suspect of partiality! The test conditions are explained in detail below so that anyone can easily reproduce them and compare with the results presented here; if something is wrong please post a comment and I will…

ProgLanguages · Python · ProfilingPerfsComplexity · performances

April 22, 2015 11:19:46 PM GMT+02:00 * · permalink

·

https://brythonista.wordpress.com/2015/03/28/comparing-the-speed-of-cpython-brython-skulpt-and-pypy-js/

Numba vs Cython: How to Choose | Climate Science & Engineering

My name is Stephan, and I'm a scientist on the Climatology team at The Climate Corporation. We make extensive use of Python to build statistical weather models, and sometimes we need our code to be fast. Here's how I choose between Numba and Cython, two of the best options for accelerating numeric Python code. Most…

ProgLanguages · Python · ProfilingPerfsComplexity · performances

April 16, 2015 07:57:19 AM GMT+02:00 * · permalink

·

http://eng.climate.com/2015/04/09/numba-vs-cython-how-to-choose/