🐍

Python

Master Python programming with tutorials, libraries, and best practices

1590
Total Items
114
Papers
120
Videos
9
Books
1316
Blogs
19
News
7
Podcasts
Browsing by date

Navigate through content by publication date

Sat, Nov 15

16 items found

1 of 64
📄 paper

An insight into the technical debt-fix trade off in software backporting

Maintaining software is an ongoing process that stretches beyond the initial release. Stable software versions continuously evolve to fix bugs, add improvements, address security issues, and ensure compatibility. This ongoing support involves Backporting, which means taking a fix or update from a newer version and applying it to an older version of the same software. As software versions evolve, new technical debt can arise during backport maintenance activities. This study examines the technical debt involved in fixing 105,396 commits from 31,076 backport sources across 87 repositories in three software ecosystems (Apache, Eclipse, and Python). The goal is to identify when and why new technical debt arises during backporting in stable source code. Our results indicate that approximately 4.3% of backports introduce new technical debt. Apache contributes the most absolute instances, while Python and Eclipse exhibit nearly three times higher debt-to-commit ratios than Apache. Feature migrations make older Apache releases debt-prone in the early phase, whereas Python and Eclipse releases tend to accumulate technical debt mostly during the middle phase of their release cycles. Additionally, developers who are inexperienced, under high workloads, or non-owners are more likely to introduce technical debt during backporting.

Python advanced
By: Jarin Tasnim, Debasish Chakroborti, Chanchal K. Roy +1 more
Source: arXiv Nov 11, 2025
0.0
5 min read
0
Quality
📄 paper

Evaluating Software Process Models for Multi-Agent Class-Level Code Generation

Modern software systems require code that is not only functional but also maintainable and well-structured. Although Large Language Models (LLMs) are increasingly used to automate software development, most studies focus on isolated, single-agent function-level generation. This work examines how process structure and role specialization shape multi-agent LLM workflows for class-level code generation. We simulate a Waterfall-style development cycle covering Requirement, Design, Implementation, and Testing using three LLMs (GPT-4o-mini, DeepSeek-Chat, and Claude-3.5-Haiku) on 100 Python tasks from the ClassEval benchmark. Our findings show that multi-agent workflows reorganize, rather than consistently enhance, model performance. Waterfall-style collaboration produces cleaner and more maintainable code but often reduces functional correctness (-37.8\% for GPT-4o-mini and -39.8\% for DeepSeek-Chat), with Claude-3.5-Haiku as a notable exception (+9.5\%). Importantly, process constraints shift failure characteristics: structural issues such as missing code decrease, while semantic and validation errors become more frequent. Among all stages, Testing exerts the strongest influence by improving verification coverage but also introducing new reasoning failures, whereas Requirement and Design have comparatively modest effects. Overall, this study provides empirical evidence that software process structure fundamentally alters how LLMs reason, collaborate, and fail, revealing inherent trade-offs between rigid workflow discipline and flexible problem-solving in multi-agent code generation.

Python advanced
By: Wasique Islam Shafin, Md Nakhla Rafi, Zhenhao Li +1 more
Source: arXiv Nov 12, 2025
0.0
5 min read
0
Quality
📄 paper

Quality Assurance of LLM-generated Code: Addressing Non-Functional Quality Characteristics

In recent years, LLMs have been widely integrated into software engineering workflows, supporting tasks like code generation. However, while these models often generate functionally correct outputs, we still lack a systematic understanding and evaluation of their non-functional qualities. Existing studies focus mainly on whether generated code passes the tests rather than whether it passes with quality. Guided by the ISO/IEC 25010 quality model, this study conducted three complementary investigations: a systematic review of 108 papers, two industry workshops with practitioners from multiple organizations, and an empirical analysis of patching real-world software issues using three LLMs. Motivated by insights from both the literature and practitioners, the empirical study examined the quality of generated patches on security, maintainability, and performance efficiency. Across the literature, we found that security and performance efficiency dominate academic attention, while maintainability and other qualities are understudied. In contrast, industry experts prioritize maintainability and readability, warning that generated code may accelerate the accumulation of technical debt. In our evaluation of functionally correct patches generated by three LLMs, improvements in one quality dimension often come at the cost of others. Runtime and memory results further show high variance across models and optimization strategies. Overall, our findings reveal a mismatch between academic focus, industry priorities, and model performance, highlighting the urgent need to integrate quality assurance mechanisms into LLM code generation pipelines to ensure that future generated code not only passes tests but truly passes with quality.

Python advanced Artificial Intelligence
By: Xin Sun, Daniel Ståhl, Kristian Sandahl +1 more
Source: arXiv Nov 13, 2025
0.0
10 min read
0
Quality
📄 paper

SPADA: A Spatial Dataflow Architecture Programming Language

Spatial dataflow architectures like the Cerebras Wafer-Scale Engine achieve exceptional performance in AI and scientific applications by leveraging distributed memory across processing elements (PEs) and localized computation. However, programming these architectures remains challenging due to the need for explicit orchestration of data movement through reconfigurable networks-on-chip and asynchronous computation triggered by data arrival. Existing FPGA and CGRA programming models emphasize loop scheduling but overlook the unique capabilities of spatial dataflow architectures, particularly efficient dataflow over regular grids and intricate routing management. We present SPADA, a programming language that provides precise control over data placement, dataflow patterns, and asynchronous operations while abstracting architecture-specific low-level details. We introduce a rigorous dataflow semantics framework for SPADA that defines routing correctness, data races, and deadlocks. Additionally, we design and implement a compiler targeting Cerebras CSL with multi-level lowering. SPADA serves as both a high-level programming interface and an intermediate representation for domain-specific languages (DSLs), which we demonstrate with the GT4Py stencil DSL. SPADA enables developers to express complex parallel patterns -- including pipelined reductions and multi-dimensional stencils -- in 6--8x less code than CSL with near-ideal weak scaling across three orders of magnitude. By unifying programming for spatial dataflow architectures under a single model, SPADA advances both the theoretical foundations and practical usability of these emerging high-performance computing platforms.

Python advanced
By: Lukas Gianinazzi, Tal Ben-Nun, Torsten Hoefler
Source: arXiv Nov 12, 2025
0.0
10 min read
0
Quality
📄 paper

Cyclotron: Compilation of Recurrences to Distributed and Systolic Architectures

We present Cyclotron, a framework and compiler for using recurrence equations to express streaming dataflow algorithms, which then get portably compiled to distributed topologies of interlinked processors. Our framework provides an input language of recurrences over logical tensors, which then gets lowered into an intermediate language of recurrences over logical iteration spaces, and finally into programs of send, receive, and computation operations specific to each individual processor. In Cyclotron's IR, programs are optimized such that external memory interactions are confined to the boundaries of the iteration space. Within inner iteration spaces, all data accesses become local: data accesses target values residing in local fast memory or on neighboring processing units, avoiding costly memory movement. We provide a scheduling language allowing users to define how data gets streamed and broadcasted between processors, enabling pipelined execution of computation kernels over distributed topologies of processing elements. We demonstrate the portability of our approach by compiling our IR to a reconfigurable simulator of systolic arrays and chiplet style distributed hardware, as well as to distributed-memory CPU clusters. In the simulated reconfigurable setting, we use our compiler for hardware design space exploration in which link costs and latencies can be specified. In the distributed CPU setting, we show how to use recurrences and our scheduling language to express various matrix multiplication routines (Cannon, SUMMA, PUMMA, weight stationary) and solvers (Triangular solve and Cholesky). For matrix multiplication and the triangular solve, we generate distributed implementations competitive with ScaLAPACK.

Python advanced
By: Shiv Sundram, Akhilesh Balasingam, Nathan Zhang +2 more
Source: arXiv Nov 12, 2025
0.0
10 min read
0
Quality