fix: Panic in unicode boundary checking code #578

Merged
wetneb merged 2 commits from wetneb/string_boundary_panic into main 2025-08-31 09:05:34 +02:00 AGit
Owner

Since the full length of the string is a valid unicode boundary,
the range of lengths to check needs to end with the "len + 1" excluded
upper bound.

Since the full length of the string is a valid unicode boundary, the range of lengths to check needs to end with the "len + 1" excluded upper bound.
wetneb force-pushed wetneb/string_boundary_panic from 863e51d2ac to ebe7a7b21a 2025-08-30 23:25:11 +02:00 Compare
Author
Owner

Note that Rust's implementation is quite different, I wonder if it would make sense to just copy it?
https://doc.rust-lang.org/src/core/str/mod.rs.html#446

Note that Rust's implementation is quite different, I wonder if it would make sense to just copy it? https://doc.rust-lang.org/src/core/str/mod.rs.html#446
Owner

Note that Rust's implementation is quite different

Yeah, it does some shenanigans to avoid bounds checks and the like -- definitely the right decision for an std function, but I decided to keep it dead simple for our use case (yet still managed to make a mistake 🤦‍♀️)

I wonder if it would make sense to just copy it?

The feature that implements this function has already been stabilized, so I'd say we just wait for Rust 1.91.0..

> Note that Rust's implementation is quite different Yeah, it does some shenanigans to avoid bounds checks and the like -- definitely the right decision for an std function, but I decided to keep it dead simple for our use case (yet still managed to make a mistake 🤦‍♀️) > I wonder if it would make sense to just copy it? The feature that implements this function [has already been stabilized](https://github.com/rust-lang/rust/pull/145756), so I'd say we just wait for Rust 1.91.0..
ada4a approved these changes 2025-08-30 23:36:14 +02:00
ada4a left a comment
Owner

well that's embarrassing.. thank you for noticing this^^

well that's embarrassing.. thank you for noticing this^^
@ -35,3 +35,3 @@
len
} else {
(index..len)
(index..(len + 1))
Owner

nit: index..=len is imo a bit more readable

nit: `index..=len` is imo a bit more readable
@ -43,0 +43,4 @@
#[cfg(test)]
mod test {
use crate::StrExt;
Owner

nit: replacing this with use super::* is probably more future-proof..

nit: replacing this with `use super::*` is probably more future-proof..
@ -43,0 +47,4 @@
#[allow(unstable_name_collisions)]
#[test]
fn char_boundary_at_end() {
Owner

thank you for actually adding a test for this:)

thank you for actually adding a test for this:)
Integrate review feedback
All checks were successful
/ test (pull_request) Successful in 39s
dfb6c5a2d2
wetneb merged commit 17e3d128dd into main 2025-08-31 09:05:34 +02:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: mergiraf/mergiraf#578
No description provided.