<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: jmaargh</title>
    <description>The latest articles on Forem by jmaargh (@jmaargh).</description>
    <link>https://forem.com/jmaargh</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F273052%2F2bcb8aba-1b97-4b9e-9783-3205b50b1187.png</url>
      <title>Forem: jmaargh</title>
      <link>https://forem.com/jmaargh</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/jmaargh"/>
    <language>en</language>
    <item>
      <title>An alternative Any type?</title>
      <dc:creator>jmaargh</dc:creator>
      <pubDate>Mon, 16 Oct 2023 15:28:31 +0000</pubDate>
      <link>https://forem.com/jmaargh/an-alternative-any-type-41d2</link>
      <guid>https://forem.com/jmaargh/an-alternative-any-type-41d2</guid>
      <description>&lt;p&gt;Rust's &lt;a href="https://doc.rust-lang.org/stable/core/any/trait.Any.html"&gt;&lt;code&gt;Any&lt;/code&gt;&lt;/a&gt; type is pretty cool. You can use it to do  runtime type reflection, or downcasting, or dynamic typing, or other fun things. However, there are a couple of slightly annoying things about it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://doc.rust-lang.org/stable/core/any/struct.TypeId.html"&gt;&lt;code&gt;TypeId&lt;/code&gt;&lt;/a&gt; is currently 128 bits. This is because it's some hash of the concrete type, so needs to be long enough to reasonably avoid hash collisions.&lt;/li&gt;
&lt;li&gt;Getting &lt;code&gt;TypeId&lt;/code&gt; from &lt;code&gt;&amp;amp;dyn Any&lt;/code&gt; requires two dereferences: first you follow the vtable pointer to find the pointer to &lt;code&gt;Any::type_id()&lt;/code&gt;, then you call that function.&lt;/li&gt;
&lt;/ol&gt;
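
&lt;p&gt;For reference, here is a minimal sketch of the existing mechanism being discussed. The helper name &lt;code&gt;dynamic_type_id&lt;/code&gt; and the value &lt;code&gt;42u32&lt;/code&gt; are made up for illustration. It looks up a &lt;code&gt;TypeId&lt;/code&gt; through a &lt;code&gt;&amp;amp;dyn Any&lt;/code&gt; and prints the size of &lt;code&gt;TypeId&lt;/code&gt; on whatever toolchain compiles it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;use std::any::{Any, TypeId};

// Hypothetical helper: looks up the `TypeId` dynamically, via the vtable.
fn dynamic_type_id(value: &amp;amp;dyn Any) -&amp;gt; TypeId {
    value.type_id()
}

fn main() {
    let value = 42u32;
    assert_eq!(dynamic_type_id(&amp;amp;value), TypeId::of::&amp;lt;u32&amp;gt;());
    assert_ne!(dynamic_type_id(&amp;amp;value), TypeId::of::&amp;lt;i32&amp;gt;());
    // Size of `TypeId` in bytes; an implementation detail that may change.
    println!("{}", std::mem::size_of::&amp;lt;TypeId&amp;gt;());
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;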

&lt;p&gt;In the vast majority of cases this is totally fine (which is why the excellent libs team implemented it this way). You're unlikely to be bottlenecked on either of these. But neither is ideal: &lt;code&gt;u128&lt;/code&gt; operations can be pretty slow on older or embedded chips and nobody likes more indirections than are necessary.&lt;/p&gt;

&lt;p&gt;It occurs to me that both can be circumvented, if you're willing to give up one thing: stability of &lt;code&gt;TypeId&lt;/code&gt; values. That is, if you don't need to assume that &lt;code&gt;TypeId&lt;/code&gt;s are the same between different binaries. This seems to be a fairly small thing to give up in most cases. How often are people serialising &lt;code&gt;TypeId&lt;/code&gt;s? Doing so is already a bad idea as they're not guaranteed to be stable between Rust compiler releases.&lt;/p&gt;

&lt;p&gt;The idea is to simply store the type ID directly in the vtable and have the compiler guarantee that, in the context of the current build, the ID is unique. No second indirection, no IDs longer than necessary.&lt;/p&gt;

&lt;p&gt;Doing this "properly" would require some compiler hacking. But I did come up with a way it can be hacked around: I call it &lt;code&gt;PointerAny&lt;/code&gt; and &lt;code&gt;TypePointer&lt;/code&gt;. The trick is to use a pointer to a method of the &lt;code&gt;PointerAny&lt;/code&gt; trait &lt;strong&gt;as the type ID itself&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Let me explain. First, we define the trait&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;trait&lt;/span&gt; &lt;span class="n"&gt;PointerAny&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;'static&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;type_ptr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;TypePointer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is exactly like &lt;code&gt;core::any::Any&lt;/code&gt;, no surprises here.&lt;/p&gt;

&lt;p&gt;We also need a &lt;code&gt;TypePointer&lt;/code&gt; instead of &lt;code&gt;TypeId&lt;/code&gt;. This will hold the address of a function (as discussed above), so let's do that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[derive(PartialEq)]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nf"&gt;TypePointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;usize&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the sake of simplicity I'll just use a &lt;code&gt;usize&lt;/code&gt; here. Really you'd want &lt;code&gt;NonZeroUsize&lt;/code&gt; or something.&lt;/p&gt;

&lt;p&gt;Getting this &lt;code&gt;TypePointer&lt;/code&gt; statically is easy. We just take the address of the function that the vtable entry points to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;TypePointer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;PointerAny&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="nb"&gt;Sized&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;Self&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;Self&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;PointerAny&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;type_ptr&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But this isn't enough to be useful yet. We need a way of getting &lt;code&gt;TypePointer&lt;/code&gt; from a &lt;code&gt;&amp;amp;dyn PointerAny&lt;/code&gt;. In principle, I feel like there should be a good way of getting the compiler to tell us the address we're looking for. After all, the compiler knows how to &lt;em&gt;call&lt;/em&gt; this function, so it therefore knows how to find its address. Unfortunately &lt;strong&gt;I&lt;/strong&gt; don't know how to get the compiler to tell us that address, so instead I'm leaning on some very ugly unsafe code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;TypePointer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;dyn&lt;/span&gt; &lt;span class="n"&gt;PointerAny&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;Self&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;pointer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;unsafe&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vtable&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;usize&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;core&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;mem&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;transmute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="c1"&gt;// vtable consists of:&lt;/span&gt;
            &lt;span class="c1"&gt;// - drop pointer&lt;/span&gt;
            &lt;span class="c1"&gt;// - size&lt;/span&gt;
            &lt;span class="c1"&gt;// - alignment&lt;/span&gt;
            &lt;span class="c1"&gt;// - method pointers&lt;/span&gt;
            &lt;span class="c1"&gt;// In that order. So this gets us pointing to the first method.&lt;/span&gt;
            &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;method_pointer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vtable&lt;/span&gt;&lt;span class="nf"&gt;.add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="c1"&gt;// We want the pointer for this first method&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;method_pointer&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
        &lt;span class="k"&gt;Self&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pointer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This requires a little explanation. A wide-pointer like &lt;code&gt;&amp;amp;dyn PointerAny&lt;/code&gt; consists of a pointer to the type's data, followed by a pointer to the vtable. That's what the &lt;code&gt;transmute&lt;/code&gt; call is unpacking here.&lt;/p&gt;

&lt;p&gt;Rust, unfortunately for us, doesn't guarantee any particular layout for vtables. However, from what I can gather the current implementation is as outlined in the comment. First there's a function pointer to the drop implementation, then there are &lt;code&gt;usize&lt;/code&gt;s for both the size of the type and its alignment, then there are pointers to each method. Since we only have one method on &lt;code&gt;PointerAny&lt;/code&gt;, that pointer should be an offset of 3-&lt;code&gt;usize&lt;/code&gt;s from the base pointer. Which is what we take.&lt;/p&gt;

&lt;p&gt;Now you may have noticed that we haven't actually implemented &lt;code&gt;PointerAny&lt;/code&gt; yet. That's because we don't ever actually want to &lt;em&gt;call&lt;/em&gt; the &lt;code&gt;PointerAny::type_ptr&lt;/code&gt; method: we just want the compiler to give it a unique address per-type. Therefore, its implementation is the least important part of this puzzle (but still essential, as we need the compiler to actually generate it and its address). So we can just implement it in the obvious way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;impl&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;'static&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="nb"&gt;Sized&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;PointerAny&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cd"&gt;/// Be careful! If you have a `&amp;amp;dyn PointerAny`, then prefer calling&lt;/span&gt;
    &lt;span class="cd"&gt;/// `TypePointer::from` over this to avoid the extra indirection.&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;type_ptr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;TypePointer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nn"&gt;TypePointer&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;of&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note, if you call this function from a &lt;code&gt;&amp;amp;dyn PointerAny&lt;/code&gt; then you lose the benefit of avoiding the indirection: prefer calling &lt;code&gt;TypePointer::from&lt;/code&gt; or &lt;code&gt;TypePointer::of&lt;/code&gt; directly.&lt;/p&gt;

&lt;p&gt;It's also interesting that &lt;code&gt;PointerAny::type_ptr&lt;/code&gt; is far nicer than &lt;code&gt;TypePointer::from&lt;/code&gt;, despite doing the same thing, because at this point we already know the concrete type so can just get the function pointer directly.&lt;/p&gt;

&lt;p&gt;And that's it! We can now dynamically type-check just as with &lt;code&gt;core::any::Any&lt;/code&gt;!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;is_same_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;dyn&lt;/span&gt; &lt;span class="n"&gt;PointerAny&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;second&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;dyn&lt;/span&gt; &lt;span class="n"&gt;PointerAny&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nn"&gt;TypePointer&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nn"&gt;TypePointer&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="n"&gt;is_type&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;PointerAny&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;dyn&lt;/span&gt; &lt;span class="n"&gt;PointerAny&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nn"&gt;TypePointer&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nn"&gt;TypePointer&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;of&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://play.rust-lang.org/?version=stable&amp;amp;mode=debug&amp;amp;edition=2021&amp;amp;gist=4db6dde21d8869ae44b8cabecfd95e9e"&gt;Full code on playground&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So we've successfully addressed the two "shortcomings" discussed above:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Our new &lt;code&gt;TypePointer&lt;/code&gt; is only a &lt;code&gt;usize&lt;/code&gt;, which is ideal for almost every architecture.&lt;/li&gt;
&lt;li&gt;We only do one pointer dereference in &lt;code&gt;TypePointer::from&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;We've also gained &lt;code&gt;TypePointer&lt;/code&gt; being non-zero, which allows niche optimisations for &lt;code&gt;Option&lt;/code&gt; etc. (if we'd used &lt;code&gt;NonZeroUsize&lt;/code&gt;)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;On top of that we still have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;TypePointer::of&lt;/code&gt; is still a compile-time constant (no indirection)&lt;/li&gt;
&lt;li&gt;In principle this could all be done in a compile-time &lt;code&gt;const fn&lt;/code&gt;-compatible way (though you'd want to be really careful about the &lt;code&gt;const fn&lt;/code&gt; use of pointers - perhaps this isn't possible yet).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what are the tradeoffs? What have we lost?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Stability of &lt;code&gt;TypePointer&lt;/code&gt; values: if you recompile your program, even with the same compiler, these may change. Don't ever serialize these &lt;code&gt;TypePointer&lt;/code&gt;s: they're just pointers after all.&lt;/li&gt;
&lt;li&gt;Stability of implementation. I had to write some very ugly &lt;code&gt;unsafe&lt;/code&gt; code to get this to work, because I couldn't find a stable way to get the compiler to tell me the address of a vtable method from a wide pointer. In principle this needn't be so ugly, but I just could not find a way of doing it without assuming the structure of the vtable.&lt;/li&gt;
&lt;li&gt;Correctness? The current implementation assumes that the compiler will generate exactly one version of &lt;code&gt;PointerAny::type_ptr&lt;/code&gt; for any given type (when needed). That is, there is a one-to-one correspondence between addresses of &lt;code&gt;PointerAny::type_ptr&lt;/code&gt; and types themselves. I'm not 100% sure this is a guarantee, but I've assumed it's true. It's &lt;a href="https://github.com/rust-lang/rust/issues/46139"&gt;known that&lt;/a&gt; Rust can generate multiple vtables for the same types - otherwise we could just use the vtable address itself and have zero indirections - but I've assumed that the pointers contained are stable.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It's also interesting that we could have implemented &lt;code&gt;TypePointer&lt;/code&gt; over &lt;code&gt;core::any::Any&lt;/code&gt; rather than defining a new &lt;code&gt;Any&lt;/code&gt; type. The only assumptions we need are that (a) the trait is implemented for every &lt;code&gt;'static&lt;/code&gt; type, (b) there are unique addresses for at least one method per type, and (c) we know how to find that address from a wide pointer.&lt;/p&gt;

&lt;p&gt;I'd love to hear what people think of this. There are probably some things here that are wrong (well, even more wrong than the &lt;code&gt;TypePointer::from&lt;/code&gt; implementation), so let me know!&lt;/p&gt;

&lt;p&gt;Discuss on &lt;a href="https://www.reddit.com/r/rust/comments/1798o4y/an_alternative_any_type/"&gt;reddit&lt;/a&gt;&lt;/p&gt;

</description>
      <category>rust</category>
      <category>programming</category>
    </item>
    <item>
      <title>Rust's `Send` and `Sync`, but actually the opposite</title>
      <dc:creator>jmaargh</dc:creator>
      <pubDate>Tue, 28 Mar 2023 15:33:46 +0000</pubDate>
      <link>https://forem.com/jmaargh/rusts-send-and-sync-but-actually-the-opposite-4pn0</link>
      <guid>https://forem.com/jmaargh/rusts-send-and-sync-but-actually-the-opposite-4pn0</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This post is my personal notes for grokking &lt;code&gt;Send&lt;/code&gt; and &lt;code&gt;Sync&lt;/code&gt; in Rust. It's not formal, and will assume that you're basically familiar with concurrency and synchronisation, as well as Rust's main wrapper types. In particular, remember that Rust values are always owned by exactly one variable and taking references must satisfy &lt;a href="https://doc.rust-lang.org/stable/std/cell/index.html"&gt;&lt;strong&gt;aliasing&lt;/strong&gt; xor &lt;strong&gt;mutability&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Here's the secret&lt;/strong&gt;: you shouldn't be worrying about &lt;code&gt;Send&lt;/code&gt; and &lt;code&gt;Sync&lt;/code&gt;. They're the default. Almost everything is &lt;code&gt;Send&lt;/code&gt; and &lt;code&gt;Sync&lt;/code&gt;, and the compiler will auto-derive them for every type it can. The issue is &lt;code&gt;!Send&lt;/code&gt; and &lt;code&gt;!Sync&lt;/code&gt;, or really just: &lt;code&gt;!Send&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  So what is &lt;code&gt;!Send&lt;/code&gt;?
&lt;/h2&gt;

&lt;p&gt;A type is &lt;code&gt;!Send&lt;/code&gt; when values can't be owned on one thread and then moved to another. Because of single ownership a value could never be owned by two threads simultaneously anyway, so this is a restriction across the whole life of the value.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;!Send&lt;/code&gt; := this value is locked to the thread that created it&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's the core concept behind both &lt;code&gt;!Send&lt;/code&gt; and &lt;code&gt;!Sync&lt;/code&gt;. I'll get to when this is the case later, but first let's talk references.&lt;/p&gt;

&lt;p&gt;If we can have &lt;code&gt;T: !Send&lt;/code&gt;, we can also have &lt;code&gt;&amp;amp;U: !Send&lt;/code&gt; since &lt;code&gt;T&lt;/code&gt; could be &lt;code&gt;&amp;amp;U&lt;/code&gt;. This case is particularly interesting, since if we own a value of type &lt;code&gt;T&lt;/code&gt; we can create as many &lt;code&gt;&amp;amp;T&lt;/code&gt; values as we like.&lt;/p&gt;

&lt;p&gt;This means that unless &lt;code&gt;&amp;amp;T: !Send&lt;/code&gt;, we can have as many &lt;code&gt;&amp;amp;T&lt;/code&gt; values on as many threads as we like. This is great for the most part: &lt;code&gt;&amp;amp;T&lt;/code&gt; is immutable so there are no data-races... &lt;em&gt;unless&lt;/em&gt; &lt;code&gt;T&lt;/code&gt; contains &lt;a href="https://doc.rust-lang.org/stable/std/cell/index.html"&gt;interior mutability&lt;/a&gt;. Interior mutability exactly means being able to mutate &lt;code&gt;T&lt;/code&gt; behind a &lt;code&gt;&amp;amp;T&lt;/code&gt; reference. This sounds like a recipe for data races! In such cases we'll need &lt;code&gt;&amp;amp;T: !Send&lt;/code&gt; to prevent them. This is so important that it gets its own name...&lt;/p&gt;

&lt;h2&gt;
  
  
  Surprise &lt;code&gt;!Sync&lt;/code&gt;!
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;T: !Sync&lt;/code&gt; simply means &lt;code&gt;&amp;amp;T: !Send&lt;/code&gt;. Interpreting a bit, &lt;code&gt;!Sync&lt;/code&gt; means that a value cannot be referenced by multiple threads at all.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;!Sync&lt;/code&gt; := references to this value are locked to its thread&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This &lt;em&gt;almost&lt;/em&gt; means that &lt;code&gt;!Send&lt;/code&gt; implies &lt;code&gt;!Sync&lt;/code&gt;. After all, if a value cannot be used by more than one thread at different times, how could it possibly be used by &lt;em&gt;more than one&lt;/em&gt; thread at the same time? This is often true, but not a logical requirement, because &lt;code&gt;!Sync&lt;/code&gt; is about whether &lt;strong&gt;shared references&lt;/strong&gt; (&lt;code&gt;&amp;amp;T&lt;/code&gt;) can be used on multiple threads at the same time, not the value itself. It is possible (but fairly rare) for a type to be &lt;code&gt;!Send&lt;/code&gt; but still &lt;code&gt;Sync&lt;/code&gt;, for example if your type is backed by some thread-local resource but all behaviour visible through &lt;code&gt;&amp;amp;T&lt;/code&gt; does not depend on it.&lt;/p&gt;

&lt;h2&gt;
  
  
  So is this type &lt;code&gt;!Send&lt;/code&gt; or &lt;code&gt;!Sync&lt;/code&gt;?
&lt;/h2&gt;

&lt;p&gt;There are a bunch of rules of thumb. But I think the key question they boil down to is: could this type be used to move a &lt;code&gt;!Send&lt;/code&gt; value (which may be a &lt;code&gt;&amp;amp;T: !Send&lt;/code&gt;) to another thread?&lt;/p&gt;

&lt;p&gt;Rules of thumb for &lt;code&gt;!Sync&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your type transitively contains any &lt;code&gt;!Sync&lt;/code&gt; type, unless wrapped by a &lt;a href="https://doc.rust-lang.org/stable/std/sync/struct.Mutex.html#impl-Sync-for-Mutex%3CT%3E"&gt;&lt;code&gt;Mutex&lt;/code&gt;&lt;/a&gt; or similar synchronisation primitive.&lt;/li&gt;
&lt;li&gt;Your type contains interior mutability which is not synchronised. For example, it contains &lt;a href="https://doc.rust-lang.org/stable/std/cell/struct.Cell.html"&gt;&lt;code&gt;Cell&lt;/code&gt;&lt;/a&gt; or &lt;a href="https://doc.rust-lang.org/stable/std/cell/struct.RefCell.html"&gt;&lt;code&gt;RefCell&lt;/code&gt;&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;&amp;amp;T&lt;/code&gt; to your type can be used to take ownership of a value of some &lt;code&gt;!Send&lt;/code&gt; type.

&lt;ul&gt;
&lt;li&gt;This is normally the case if your type is &lt;code&gt;!Send&lt;/code&gt; itself.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Your type contains raw pointers and you haven't manually proven and implemented &lt;code&gt;Sync&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Rules of thumb for &lt;code&gt;!Send&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your type transitively contains any &lt;code&gt;!Send&lt;/code&gt; type.&lt;/li&gt;
&lt;li&gt;Your type is a handle to a resource which it owns non-uniquely, and access to that resource is not synchronised.

&lt;ul&gt;
&lt;li&gt;For &lt;code&gt;&amp;amp;T&lt;/code&gt;, this is exactly rule 2 for &lt;code&gt;!Sync&lt;/code&gt;, since if &lt;code&gt;T&lt;/code&gt; has interior mutability that means that &lt;code&gt;&amp;amp;T&lt;/code&gt; is a shared-ownership handle to &lt;code&gt;T&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Your type contains raw pointers and you haven't manually proven and implemented &lt;code&gt;Send&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;Rc&lt;/code&gt;&lt;/strong&gt; -- This is a handle to a resource that is jointly owned, therefore &lt;code&gt;!Send&lt;/code&gt; since (for example) &lt;code&gt;Rc::get_mut&lt;/code&gt; is not synchronised. Moreover, &lt;code&gt;Rc&lt;/code&gt; has interior mutability for the reference count, which is unsynchronised, so &lt;code&gt;!Sync&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;Arc&lt;/code&gt;&lt;/strong&gt; -- Avoids the problems of &lt;code&gt;Rc&lt;/code&gt; by synchronising the reference count and access appropriately using atomics, thus both &lt;code&gt;Send&lt;/code&gt; and &lt;code&gt;Sync&lt;/code&gt; (provided the wrapped type is itself &lt;code&gt;Send + Sync&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;RefCell&lt;/code&gt;&lt;/strong&gt; -- Archetypal example of interior mutability with no synchronisation, therefore &lt;code&gt;!Sync&lt;/code&gt;. However, since the wrapped value is uniquely owned, &lt;code&gt;RefCell&lt;/code&gt; is &lt;code&gt;Send&lt;/code&gt; when the wrapped value is.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;Mutex&lt;/code&gt;&lt;/strong&gt; -- If it contains a &lt;code&gt;!Send&lt;/code&gt; type then it's &lt;code&gt;!Send + !Sync&lt;/code&gt; since it provides full ownership of the contained type. Otherwise, it is both &lt;code&gt;Send&lt;/code&gt; by unique ownership of a &lt;code&gt;Send&lt;/code&gt;, and &lt;code&gt;Sync&lt;/code&gt; by enforcing synchronisation itself.&lt;/li&gt;
&lt;/ul&gt;
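
&lt;p&gt;The claims in this list can be checked at compile time. The sketch below assumes two helper functions, &lt;code&gt;assert_send&lt;/code&gt; and &lt;code&gt;assert_sync&lt;/code&gt;, which are hypothetical names rather than anything in &lt;code&gt;std&lt;/code&gt;. Each call compiles only if the type has the bound:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;use std::cell::RefCell;
use std::sync::{Arc, Mutex};

// Hypothetical helpers: they do nothing at runtime, the bound is the check.
fn assert_send&amp;lt;T: Send&amp;gt;() {}
fn assert_sync&amp;lt;T: Sync&amp;gt;() {}

fn main() {
    assert_send::&amp;lt;Arc&amp;lt;i32&amp;gt;&amp;gt;();
    assert_sync::&amp;lt;Arc&amp;lt;i32&amp;gt;&amp;gt;();
    // `RefCell` is `Send` when its contents are, but never `Sync`.
    assert_send::&amp;lt;RefCell&amp;lt;i32&amp;gt;&amp;gt;();
    // Wrapping a `Send` type in a `Mutex` gives both.
    assert_send::&amp;lt;Mutex&amp;lt;RefCell&amp;lt;i32&amp;gt;&amp;gt;&amp;gt;();
    assert_sync::&amp;lt;Mutex&amp;lt;RefCell&amp;lt;i32&amp;gt;&amp;gt;&amp;gt;();

    // Uncommenting any of these should fail to compile:
    // assert_send::&amp;lt;std::rc::Rc&amp;lt;i32&amp;gt;&amp;gt;();
    // assert_sync::&amp;lt;std::rc::Rc&amp;lt;i32&amp;gt;&amp;gt;();
    // assert_sync::&amp;lt;RefCell&amp;lt;i32&amp;gt;&amp;gt;();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;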

&lt;p&gt;Raw pointers are interesting. Rust marks all raw pointer types as &lt;code&gt;!Send&lt;/code&gt; and &lt;code&gt;!Sync&lt;/code&gt;, but moving them (and their references) between threads isn't in-and-of-itself a problem. The problem comes when you try to &lt;em&gt;use&lt;/em&gt; (that is, dereference) that pointer. That action is already marked as &lt;code&gt;unsafe&lt;/code&gt;, so Rust could have allowed them to be &lt;code&gt;Send&lt;/code&gt; and &lt;code&gt;Sync&lt;/code&gt;, but it is considered so easy to break &lt;code&gt;Send&lt;/code&gt; and &lt;code&gt;Sync&lt;/code&gt; with raw pointers that you need to &lt;strong&gt;additionally&lt;/strong&gt; implement the corresponding &lt;code&gt;unsafe&lt;/code&gt; traits to mark your type as &lt;code&gt;Send&lt;/code&gt; or &lt;code&gt;Sync&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  No free &lt;code&gt;Send&lt;/code&gt; wrapping
&lt;/h2&gt;

&lt;p&gt;"I've got this annoying value, how do I just make the damn thing &lt;code&gt;Send&lt;/code&gt; and &lt;code&gt;Sync&lt;/code&gt; already!?" I hear you cry.&lt;/p&gt;

&lt;p&gt;Bad news, I'm afraid.&lt;/p&gt;

&lt;p&gt;The better news is that if the type is &lt;code&gt;!Sync&lt;/code&gt; but is &lt;code&gt;Send&lt;/code&gt;, then you can wrap it in a &lt;a href="https://doc.rust-lang.org/stable/std/sync/struct.Mutex.html#impl-Sync-for-Mutex%3CT%3E"&gt;&lt;code&gt;Mutex&lt;/code&gt;&lt;/a&gt; or similar synchronisation type and that will make it both &lt;code&gt;Send&lt;/code&gt; and &lt;code&gt;Sync&lt;/code&gt;.&lt;/p&gt;
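
&lt;p&gt;As a rough sketch of that wrapping, assuming a made-up function &lt;code&gt;count_with_threads&lt;/code&gt;: &lt;code&gt;Cell&amp;lt;u32&amp;gt;&lt;/code&gt; is &lt;code&gt;Send&lt;/code&gt; but &lt;code&gt;!Sync&lt;/code&gt;, yet once it sits inside a &lt;code&gt;Mutex&lt;/code&gt; it can be shared between threads through an &lt;code&gt;Arc&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;use std::cell::Cell;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical example function: every thread bumps the shared counter once.
fn count_with_threads(threads: u32) -&amp;gt; u32 {
    // `Cell&amp;lt;u32&amp;gt;` alone is `!Sync`; the `Mutex` supplies the synchronisation.
    let shared = Arc::new(Mutex::new(Cell::new(0u32)));
    let handles: Vec&amp;lt;_&amp;gt; = (0..threads)
        .map(|_| {
            let shared = Arc::clone(&amp;amp;shared);
            thread::spawn(move || {
                let guard = shared.lock().unwrap();
                guard.set(guard.get() + 1);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = shared.lock().unwrap().get();
    total
}

fn main() {
    assert_eq!(count_with_threads(4), 4);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;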

&lt;p&gt;The very bad news is that &lt;code&gt;!Send&lt;/code&gt; types can only be made &lt;code&gt;Send&lt;/code&gt; by &lt;code&gt;unsafe impl Send for T&lt;/code&gt; -- which you should absolutely not do unless you very much know what you're doing.&lt;/p&gt;

&lt;p&gt;Truly &lt;code&gt;!Send&lt;/code&gt; types (that is, basically anything &lt;code&gt;!Send&lt;/code&gt; except carefully used raw pointers) are stuck on their thread. This is the entire point of the feature: anything else and you're exposed to data races.&lt;/p&gt;

&lt;p&gt;Your alternatives for dealing with &lt;code&gt;!Send&lt;/code&gt; types are, for example, to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Serialise the data contained and send that to another thread where it can be re-constructed.&lt;/li&gt;
&lt;li&gt;Use channels or other inter-thread communication to indirectly "talk to" the &lt;code&gt;!Send&lt;/code&gt; thread when needed.&lt;/li&gt;
&lt;/ol&gt;
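
&lt;p&gt;The second option might look something like the sketch below. The &lt;code&gt;Request&lt;/code&gt; enum and &lt;code&gt;spawn_owner&lt;/code&gt; function are hypothetical names for illustration. An &lt;code&gt;Rc&lt;/code&gt; lives on one thread for its whole life, and other threads only ever send it messages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;use std::cell::Cell;
use std::rc::Rc;
use std::sync::mpsc;
use std::thread;

// Hypothetical message type: everything in it is `Send`.
enum Request {
    Add(u32),
    Get(mpsc::Sender&amp;lt;u32&amp;gt;),
}

// Spawns the thread that owns the `!Send` value and returns a way to talk to it.
fn spawn_owner() -&amp;gt; mpsc::Sender&amp;lt;Request&amp;gt; {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The `Rc` is created on this thread and never leaves it.
        let total = Rc::new(Cell::new(0u32));
        for request in rx {
            match request {
                Request::Add(n) =&amp;gt; total.set(total.get() + n),
                Request::Get(reply) =&amp;gt; {
                    let _ = reply.send(total.get());
                }
            }
        }
    });
    tx
}

fn main() {
    let owner = spawn_owner();
    owner.send(Request::Add(2)).unwrap();
    owner.send(Request::Add(3)).unwrap();
    let (reply_tx, reply_rx) = mpsc::channel();
    owner.send(Request::Get(reply_tx)).unwrap();
    assert_eq!(reply_rx.recv().unwrap(), 5);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The owning thread exits on its own once every &lt;code&gt;Sender&lt;/code&gt; has been dropped, because the &lt;code&gt;for&lt;/code&gt; loop over the receiver ends at that point.&lt;/p&gt;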

&lt;h2&gt;
  
  
  When do I force &lt;code&gt;!Send&lt;/code&gt; or &lt;code&gt;!Sync&lt;/code&gt;?
&lt;/h2&gt;

&lt;p&gt;It's possible that you're writing some struct that would make no sense to send to other threads (or send references to other threads), but the compiler cannot work this out itself. This is rare, since the compiler will generally work it out before you, but possible if the issue is one of higher-level correctness that the compiler cannot reason about.&lt;/p&gt;

&lt;p&gt;For example, suppose you're wrapping some library behind FFI and you know (from the library docs) that the resource you're working with is thread-local. However, the "handle" that library gives you to said resource is just a bare primitive, like &lt;code&gt;u32&lt;/code&gt;. Rust has no idea that &lt;code&gt;u32&lt;/code&gt; is &lt;code&gt;!Send&lt;/code&gt; (acting more like a pointer) until you tell it.&lt;/p&gt;

&lt;p&gt;Right now, it's not terribly easy to force &lt;code&gt;!Send&lt;/code&gt; or &lt;code&gt;!Sync&lt;/code&gt;, since &lt;a href="https://doc.rust-lang.org/unstable-book/language-features/negative-impls.html"&gt;negative impls&lt;/a&gt; are only available on nightly. The workaround is to use a &lt;a href="https://doc.rust-lang.org/std/marker/struct.PhantomData.html"&gt;&lt;code&gt;PhantomData&lt;/code&gt;&lt;/a&gt; of some type that already has the &lt;code&gt;!Send&lt;/code&gt; or &lt;code&gt;!Sync&lt;/code&gt; you require, so that gets inherited. For example, &lt;a href="https://docs.rs/winit/0.28.3/winit/event_loop/struct.EventLoop.html"&gt;&lt;code&gt;winit::EventLoop&lt;/code&gt;&lt;/a&gt; contains a &lt;code&gt;PhantomData&amp;lt;*mut ()&amp;gt;&lt;/code&gt; explicitly for this purpose.&lt;/p&gt;
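
&lt;p&gt;A minimal sketch of that trick, assuming a made-up &lt;code&gt;LocalHandle&lt;/code&gt; type standing in for the FFI handle described above (it is not taken from &lt;code&gt;winit&lt;/code&gt; or any real library):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;use std::marker::PhantomData;

// Hypothetical FFI handle: a plain integer that must stay on its thread.
pub struct LocalHandle {
    id: u32,
    // Raw pointers are `!Send + !Sync`, and `PhantomData` passes that on
    // to the struct without storing anything.
    _not_send_sync: PhantomData&amp;lt;*mut ()&amp;gt;,
}

impl LocalHandle {
    pub fn new(id: u32) -&amp;gt; Self {
        Self { id, _not_send_sync: PhantomData }
    }

    pub fn id(&amp;amp;self) -&amp;gt; u32 {
        self.id
    }
}

fn main() {
    let handle = LocalHandle::new(7);
    assert_eq!(handle.id(), 7);
    // The marker is zero-sized, so the struct is no bigger than the `u32`.
    assert_eq!(std::mem::size_of::&amp;lt;LocalHandle&amp;gt;(), 4);
    // This should fail to compile, because `LocalHandle` is `!Send`:
    // std::thread::spawn(move || handle.id());
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;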

&lt;h2&gt;
  
  
  When do I force &lt;code&gt;Send&lt;/code&gt; or &lt;code&gt;Sync&lt;/code&gt;?
&lt;/h2&gt;

&lt;p&gt;It is, of course, possible to manually implement &lt;code&gt;Send&lt;/code&gt; and &lt;code&gt;Sync&lt;/code&gt; on something the compiler has decided is &lt;code&gt;!Send&lt;/code&gt; and &lt;code&gt;!Sync&lt;/code&gt;. This is how &lt;code&gt;std&lt;/code&gt; collections (as well as others) which work on raw pointers implement &lt;code&gt;Send&lt;/code&gt; and &lt;code&gt;Sync&lt;/code&gt; appropriately.&lt;/p&gt;

&lt;p&gt;This power -- like any use of &lt;code&gt;unsafe&lt;/code&gt; -- should absolutely not be taken lightly. Read &lt;a href="https://doc.rust-lang.org/nomicon/send-and-sync.html"&gt;the nomicon&lt;/a&gt;, reason carefully, and write good tests. Don't just &lt;code&gt;unsafe impl Send&lt;/code&gt; because you're frustrated, that way lies Undefined Behaviour and Madness.&lt;/p&gt;

</description>
      <category>rust</category>
    </item>
  </channel>
</rss>
