Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Ref::map_val allows pointer leak #5

Open
stepancheg opened this issue May 13, 2019 · 35 comments
Open

Ref::map_val allows pointer leak #5

stepancheg opened this issue May 13, 2019 · 35 comments

Comments

@stepancheg
Copy link
Contributor

stepancheg commented May 13, 2019

#[test]
fn should_fail_but_it_passes() {
    let s = RefCell::new(String::new());
    let x: RefVal<&String> = Ref::map_val(RefCell::borrow(&s), |s| s);
    let y: &String = &*x;
    drop(x);

    // assert not borrowed
    RefCell::borrow_mut(&s);

    // but pointer to data is still accessible
    y.len();
}

Don't know yet how to fix it.

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 13, 2019

Can you quickly explain what should fail in your opinion, why and how?

@stepancheg
Copy link
Contributor Author

stepancheg commented May 13, 2019

I don't know what should fail exactly. Somehow y should be invalid after drop(x).

If it's not clear what's wrong with current behavior, there's another example:

    let s = RefCell::new("abcd".to_owned());
    let x: RefVal<&String> = Ref::map_val(RefCell::borrow(&s), |s| s);
    let y: &String = &*x;
    let yy = &y[..];
    drop(x);

    // clear the content
    RefCell::borrow_mut(&s).clear();

    eprintln!("<{}>", y);
    eprintln!("<{}>", yy);

This prints:

<>
<abcd>

It is "use after free": RefCell content is cleared, but it is still "accessible" using yy variable.

@stepancheg
Copy link
Contributor Author

Or even more clear example:

    enum Zzz {
        I(u32),
        S(String),
    }

    let s = RefCell::new(Zzz::S("abcd".to_owned()));
    let x: RefVal<&String> = Ref::map_val(RefCell::borrow(&s), |s| match s {
        Zzz::I(_) => unreachable!(),
        Zzz::S(ref s) => s,
    });
    let y: &String = &*x;
    drop(x);

    // clear the content
    *RefCell::borrow_mut(&s) = Zzz::I(1);

    println!("{}", *y);

This code results in SIGSEGV.

stepancheg pushed a commit to stepancheg/cell that referenced this issue May 13, 2019
escape_debug() is roughly the Rust equivalent of python's repr on str.

Fixes d-e-s-o#5.
@d-e-s-o
Copy link
Owner

d-e-s-o commented May 13, 2019

Thanks for the examples! I hope to be able to look at this problem more closely by the end of the week.

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 18, 2019

It appears that the lifetimes are messed up for RefVal. The one difference we have over RefCell::map is the lifetime 'b for F in map_val

    pub fn map_val<U: Sized, F>(orig: Ref<'b, T>, f: F) -> RefVal<'b, U>
        where F: FnOnce(&'b T) -> U
    // vs.
    pub fn map<U: ?Sized, F>(orig: Ref<'b, T>, f: F) -> Ref<'b, U>
        where F: FnOnce(&T) -> &U

So our RefVal lives for 'b (which is correct I'd say), but it seems as if the deref loses information about this fact. I am not sure how as bug like this is even possible in safe Rust.

This example illustrates that things work as expected when the RefCell itself leaves the scope:

#[test]
fn should_fail_but_it_passes() {
    use std::ops::Deref;

    let y: &String = {
      let s = RefCell::new("hallo".into());
      let b = RefCell::borrow(&s);
      let x: RefVal<&String> = Ref::map_val(b, |s| s);
      x.deref()
    };

    assert_eq!(y.as_str(), "hallo");
}
369 |     let y: &String = {
    |         - borrow later stored here
370 |       let s = RefCell::new("hallo".into());
371 |       let b = RefCell::borrow(&s);
    |                               ^^ borrowed value does not live long enough
...
374 |     };
    |     - `s` dropped here while still borrowed

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 18, 2019

Interestingly, Ref seemingly has the exact same problem:

#[test]
fn should_fail_but_it_passes() {
    use std::ops::Deref;

    let st = "hallo".to_string();
    let s = RefCell::new(&st);
    let y: &String = {
      let b = RefCell::borrow(&s);
      let x: Ref<&String> = Ref::map(b, |s| s);
      x.deref()
    };

    // assert not borrowed
    RefCell::borrow_mut(&s);

    // but pointer to data is still accessible
    assert_eq!(y.as_str(), "hallo");
}

This example just works. But should it?

Which gets me back to the question why we think it is a problem. RefCell contains a reference. Multiple references to an object can exist as long as they don't allow for mutation of the pointee.

I also don't think that this is really a use-after-free. As I established in the previous example, if the variable being referenced is dropped the code won't compile. It's only the borrow that is "freed".

@stepancheg
Copy link
Contributor Author

stepancheg commented May 18, 2019

removed

@stepancheg
Copy link
Contributor Author

No wait.

@stepancheg
Copy link
Contributor Author

stepancheg commented May 18, 2019

There's nothing wrong with RefCell.

RefCell::new() in your last example holds &String, not String.

So when calling RefCell::borrow_mut(&s); it invalidates pointers to &String, not pointers to String (invalidates &&String, not &String).

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 18, 2019

Hm yeah, I noticed that too after experimenting a little more. Sigh

@stepancheg
Copy link
Contributor Author

I stopped understand anything:

struct Foo<T> {
    t: T,
}

impl<T> Foo<T> {
    fn deref2(&self) -> &T {
        &self.t
    }
}

fn foo(f: Foo<&String>) {
    let p: &String = f.deref2();
}

fn main() {
    println!("$");
}

This code compiles. But should it? This line:

    let p: &String = f.deref2();

f.deref2() should return &&String, not &String.

@stepancheg
Copy link
Contributor Author

Moreover, both these lines are correct:

    let p: &String = f.deref2();
    let p: &&String = f.deref2();

@stepancheg
Copy link
Contributor Author

OK, seems like Rust does auto-dereferencing.

Since when?

    let s = "abc".to_owned();
    let p: &String = &s;
    let pp: &&String = &p;
    let p1: &String = pp;

@stepancheg
Copy link
Contributor Author

OK, it is known. users

@stepancheg
Copy link
Contributor Author

stepancheg commented May 18, 2019

OK, new minimal test case to reproduce the problem:

struct Foo<T> {
    t: T,
}

impl<T> Foo<T> {
    fn deref2(&self) -> &T {
        &self.t
    }
}

fn foo(f: Foo<&String>) {
    let p1: &String = f.deref2();
    drop(f);
    println!("{}", p1);
}

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 18, 2019

OK, new minimal test case to reproduce the problem:

struct Foo<T> {
    t: T,
}

impl<T> Foo<T> {
    fn deref2(&self) -> &T {
        &self.t
    }
}

fn foo(f: Foo<&String>) {
    let p1: &String = f.deref2();
    drop(f);
    println!("{}", p1);
}

Hm, but isn't that the same problem I hit? Foo stores a reference because T=&String, no? So ultimately your deref2 pulls out a reference to an object that still exists.

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 18, 2019

(I am inferring that based on the function signature: fn foo(f: Foo<&String>))

@stepancheg
Copy link
Contributor Author

So ultimately your deref2 pulls out a reference to an object that still exists.

Yes, you are right.

Thinking...

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 18, 2019

What I find very interesting is that when I make RefVal use BorrowRefMut instead of BorrowRef, I seemingly get properly checked lifetimes and a compile error:

--- src/cell.rs
+++ src/cell.rs
@@ -579,19 +579,19 @@ impl<'b, T: ?Sized> Ref<'b, T> {
         }
     }

-    /// Make a new `RefVal` from the borrowed data.
-    ///
-    /// The `RefCell` is already immutably borrowed, so this operation
-    /// cannot fail.
-    #[inline]
-    pub fn map_val<U: Sized, F>(orig: Ref<'b, T>, f: F) -> RefVal<'b, U>
-        where F: FnOnce(&'b T) -> U
-    {
-        RefVal {
-            value: f(orig.value),
-            borrow: orig.borrow,
-        }
-    }
+    ///// Make a new `RefVal` from the borrowed data.
+    /////
+    ///// The `RefCell` is already immutably borrowed, so this operation
+    ///// cannot fail.
+    //#[inline]
+    //pub fn map_val<U: Sized, F>(orig: Ref<'b, T>, f: F) -> RefVal<'b, U>
+    //    where F: FnOnce(&'b T) -> U
+    //{
+    //    RefVal {
+    //        value: f(orig.value),
+    //        borrow: orig.borrow,
+    //    }
+    //}
 }

 impl<T: ?Sized + fmt::Display> fmt::Display for Ref<'_, T> {
@@ -641,10 +641,10 @@ impl<'b, T: ?Sized> RefMut<'b, T> {
     /// The `RefCell` is already immutably borrowed, so this operation
     /// cannot fail.
     #[inline]
-    pub fn map_val<U: Sized, F>(orig: RefMut<'b, T>, f: F) -> RefValMut<'b, U>
+    pub fn map_val<U: Sized, F>(orig: RefMut<'b, T>, f: F) -> RefVal<'b, U>
         where F: FnOnce(&'b mut T) -> U
     {
-        RefValMut {
+        RefVal {
             value: f(orig.value),
             borrow: orig.borrow,
         }
@@ -733,7 +733,7 @@ impl<T: ?Sized + fmt::Display> fmt::Display for RefMut<'_, T> {
 /// See the [module-level documentation](index.html) for more.
 pub struct RefVal<'b, T> {
     value: T,
-    borrow: BorrowRef<'b>,
+    borrow: BorrowRefMut<'b>,
 }

 impl<'b, T> RefVal<'b, T> {
#[test]
fn should_fail_but_it_passes() {
    use std::ops::Deref;
    use std::ops::DerefMut;

    let s = RefCell::new("hallo".into());
    let y: &String = {
      let b = RefCell::borrow_mut(&s);
      let x: RefVal<&mut String> = RefMut::map_val(b, |s| s);
      x.deref()
    };

    let mut z = RefCell::borrow_mut(&s);
    *z = "hihi".to_string();

    assert_eq!(y.as_str(), "hallo");
}
error[E0597]: `x` does not live long enough
   --> tests/cell.rs:485:7
    |
482 |     let y: &String = {
    |         - borrow later stored here
...
485 |       x.deref()
    |       ^ borrowed value does not live long enough
486 |     };
    |     - `x` dropped here while still borrowed

But then I checked the definitions of both structs and couldn't really find something that explains that difference to me.

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 18, 2019

(I came to that because for RefValMut the behavior seems to be correct; somehow this suggests to me that it has to do with the multiple-immutable vs. only one mutable borrow rules, but I can't point my finger to it)

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 18, 2019

Yeah, it definitely has something to do with that.

This example correctly fails to compile (everything is mut):

#[test]
fn should_fail_but_it_passes() {
    use std::ops::Deref;
    use std::ops::DerefMut;

    let s = RefCell::new("hallo".into());
    let y: &String = {
      let b = RefCell::borrow_mut(&s);
      let x: RefValMut<&mut String> = RefMut::map_val(b, |s| s);
      x.deref()
    };

    let mut z = RefCell::borrow_mut(&s);
    *z = "hihi".to_string();

    assert_eq!(y.as_str(), "hallo");
}

Now let's make the following change:

--- tests/cell.rs
+++ tests/cell.rs
@@ -481,7 +481,7 @@ fn should_fail_but_it_passes() {
     let s = RefCell::new("hallo".into());
     let y: &String = {
       let b = RefCell::borrow_mut(&s);
-      let x: RefValMut<&mut String> = RefMut::map_val(b, |s| s);
+      let x: RefValMut<&String> = RefMut::map_val(b, |s| s as &String);
       x.deref()
     };

And now it compiles.

@stepancheg
Copy link
Contributor Author

So if we change in first example everything from mutable to immutables, it starts to compile.

Looks suspicious.

@stepancheg
Copy link
Contributor Author

So, the minimal example would be:

Compiles fine with &:

struct Foo<T> {
    t: T,
}

impl<T> Foo<T> {
    fn deref2(&self) -> &T {
        &self.t
    }
}

fn foo(f: Foo<&String>) {
    let p1: &String = *f.deref2();
    drop(f);
    println!("{}", p1);
}

But fails to compile with &mut:

struct Foo<T> {
    t: T,
}

impl<T> Foo<T> {
    fn deref2(&self) -> &T {
        &self.t
    }
}

fn foo(f: Foo<&mut String>) {
    let p1: &mut String = *f.deref2();
    drop(f);
    println!("{}", p1);
}
error[E0596]: cannot borrow data in a `&` reference as mutable
   --> src/main.rs:122:27
    |
122 |     let p1: &mut String = *f.deref2();
    |                           ^^^^^^^^^^^ cannot borrow as mutable

error[E0505]: cannot move out of `f` because it is borrowed
   --> src/main.rs:123:10
    |
122 |     let p1: &mut String = *f.deref2();
    |                            - borrow of `f` occurs here
123 |     drop(f);
    |          ^ move out of `f` occurs here
124 |     println!("{}", p1);
    |                    -- borrow later used here

@stepancheg
Copy link
Contributor Author

So looks like there's special handling of &mut somewhere in the compiler (to guarantee it is unique) which prevents the same issue in RefValMut.

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 18, 2019

So looks like there's special handling of &mut somewhere in the compiler (to guarantee it is unique) which prevents the same issue in RefValMut.

I guess it's something like that. I am not sure your minimal example actually illustrates that very case, though.

But then I think my brain is fried. Will see if I have a clear thought tomorrow.

@stepancheg
Copy link
Contributor Author

The problem, how I understand it (again).

Basically deref signature of Ref is this:

impl<T> Ref<'_, T> {
    fn deref<'r, 's>(&'s self) -> &'r T where 's: 'r { ... }
}

Which means that lifetime of the returned pointer ('r) must not outlive the lifetime of self ('s).

Translating this example to RefVal we should get:

impl RefVal<'_, T> {
    fn deref<'s>(&'s self) -> &'s T where 's: T { ... }
}

Which means that that lifetime of T must not outlive the lifetime of self ('s).

But this is not valid Rust syntax.

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 22, 2019

Yep. We are somehow missing the connection from the anonymous RefVal lifetime to that of deref's return value. However, I still can't wrap my head around how that can cause an issue like this -- because it's still safe Rust :-|

What I would do next (but haven't found the time to) is to create a truly minimal example showing the same unsafety on nothing more than a combination of a struct, a map, and a deref (or whatever we need) -- decoupled from the existing Cell stuff. If we can come up with that we should be able to file an issue. And if not perhaps that makes the problem on our side more clear.

@stepancheg
Copy link
Contributor Author

I tried to reproduce it with safe code. I'm certain it is impossible.

Because it's not possible to implement borrow_mut with &self parameter and safe rust.

Rust is safe. Just not flexible enough.

@d-e-s-o
Copy link
Owner

d-e-s-o commented May 23, 2019

Not sure I am following. What are you suggesting is happening here? I am not using unsafe Rust and yet we have an memory unsafety issue, don't we (otherwise how can we segfault)? So that leaves three possibilities from my point of view:

  1. Rust has an unsafety issue
  2. One of the Cell structs has an unsafety issue (they use unsafe internally)
  3. I am using some of the Cell objects in a wrong way, violating some form of contract, resulting in unsafety

I agree that 1) is unlikely.

Because it's not possible to implement borrow_mut with &self parameter and safe rust.

Okay, but internal mutability is part of the language (by language I am including the standard library).

@stepancheg
Copy link
Contributor Author

I am using some of the Cell objects in a wrong way, violating some form of contract, resulting in unsafety

Yes, I think this is it.

@stepancheg
Copy link
Contributor Author

BTW, until Rust is changed to support this use case, this sad and incomplete workaround is possible:

impl<T> RefVal<'_, T> {
    /// Sad function
    pub fn get_static<U,F>(&mut self, f: F) -> U
        where
            U: 'static,
            F: FnOnce(&mut T) -> U + 'static
    {
        f(&mut self.value)
    }
}

@d-e-s-o
Copy link
Owner

d-e-s-o commented Jun 15, 2019

"Minimal" test harness: cell-issue-5.tar.gz

@d-e-s-o
Copy link
Owner

d-e-s-o commented Jun 16, 2019

FYI, I've posed this "challenge" to a wider audience in the Rust user forum and there were a few responses, most notably https://users.rust-lang.org/t/unsafety-with-advanced-version-of-refcell/29323/4

I haven't yet fully gone through it.

@hiddenhare
Copy link

Hi folks,

I spent a few hours today trying to make this work, as it seemed like it might be useful for a project of mine. I figured I might as well report my findings.

The reason why the current implementation is unsafe is the choice of lifetime in map_val.

The signature is currently:

pub fn map_val<U: Sized, F>(orig: Ref<'b, T>, f: F) -> RefVal<'b, U> where
    F: FnOnce(&'b T) -> U

'b is the lifetime of the RefCell which is being borrowed, and the true lifetime of the T which is being stored. However, it's always unsafe to create a &'b T; you're essentially giving your users direct access to the RefCell's interior, which defeats the point of having a RefCell in the first place! If somebody has a &'b T then they can drop the Ref<'b, T> from which it originated, mutably borrow the RefCell itself to create a RefMut<'b, T>, and dereference that to get a simultaneous &'a mut T.

We want to replace &'b T with &'a T, where 'a is some lifetime which can't outlive the RefVal which we're constructing. Rust does most of the work for us if we just leave the FnOnce(&T) without a lifetime annotation. It says "this function's first parameter has an anonymous lifetime, and so when it's accessed by the function body, it's not possible to derive a value from it which will outlive that anonymous lifetime".

This is exactly how Ref::map works:

pub fn map<U: ?Sized, F>(orig: Ref<'b, T>, f: F) -> Ref<'b, U> where
    F: FnOnce(&T) -> &U

We can try to apply the same principle to map_val:

pub fn map_val<U: Sized, F>(orig: Ref<'b, T>, f: F) -> RefVal<'b, U> where
    F: FnOnce(&T) -> U

Unfortunately, this (correctly!) prevents map_val calls from compiling. The problem is that U is borrowing data from an anonymous function argument which, as far as rustc knows, could have been dropped as soon as the closure returned, so rustc won't let us store a value of type U higher up the stack frame.

I think it will only be possible to make this work the way you've envisaged it once we have higher-kinded type parameters. Something like this:

pub fn map_val<U<'a>: Sized, F>(orig: Ref<'b, T>, f: F) -> RefVal<'b, U> where
    F: for<'a> FnOnce(&'a T) -> U<'a>;

impl<'b, U<'a>> Deref for RefVal<'b, U> {
    type Target<'a> = U<'a>;
    fn deref<'a>(&'a self) -> &'a U<'a>;
}

Unfortunately this kind of syntax is far away on the horizon. Even GATs wouldn't be enough to make it work; we'd need full-fledged higher-kinded type parameters.

Sorry for the bad news! If somebody wants to prove me wrong then I'm all ears.

cc rust-lang/rust#54776

@d-e-s-o
Copy link
Owner

d-e-s-o commented Aug 13, 2019

Many thanks for the analysis! I'll see if it makes sense to point out this unsafety in the README or some place visible.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants