catatp.fm Unofficial Accidental Tech Podcast transcripts (generated by computer, so expect errors).

175: Dance Palace

More than you ever wanted to know about filenames.

Episode Description:

Sponsored by:

  • Fracture: Photos printed in vivid color directly on glass. Use promo code ATP10 for 10% off your first order.
  • Hover: The best way to buy and manage domain names. Use coupon code APFS for 10% off your first purchase.
  • Backblaze: Online backup for $5/month. Native. Unlimited. Unthrottled. Uncomplicated.

Transcribed using Whisper large_v2 (transcription) + WAV2VEC2_ASR_LARGE_LV60K_960H (alignment) + Pyannote (speaker diarization).

Chapters

  1. Parallelizing LAME
  2. Sierra Mac Pro support
  3. Info from The Talk Show Live
  4. Conquesting in search ads
  5. Apple Photos keyword searching
  6. Sponsor: Fracture (code ATP10)
  7. Thunderbolt Display discontinued
  8. Sponsor: Hover (code APFS)
  9. APFS: Schedule and migration
  10. APFS: Naming
  11. APFS: Unicode filenames
  12. Sponsor: Backblaze
  13. APFS: Unicode filenames, cont’d.
  14. APFS: (Lack of) data integrity
  15. Ending theme
  16. Post-show: Email and parenting

Parallelizing LAME

⏹️ ▶️ John I was always just thinking this file system would save me. I just thought like, well, I’ll just, I’ll stick it out

⏹️ ▶️ John and there’ll be a new file system here probably next year. And I’ve been thinking that for a lot of years now.

⏹️ ▶️ Marco Well, I spent this morning trying to parallelize the LAME MP3 encoder. Why?

⏹️ ▶️ Marco Why not?

⏹️ ▶️ Marco, Casey Because it’s a

⏹️ ▶️ Marco very hard problem that hasn’t been solved since like the early 2000s. And I often

⏹️ ▶️ Marco need to encode our show MP3, which takes probably, I don’t know, 90

⏹️ ▶️ Marco seconds to encode. It’s a surprisingly long time for a modern computing task

⏹️ ▶️ Marco like that, and that annoys me. And so I figured, why don’t I spend the afternoon seeing if I

⏹️ ▶️ Marco can parallelize the MP3 encoder? And there are faster

⏹️ ▶️ Marco encoders than the LAME encoder. And I know this is hard for me to talk about because I know

⏹️ ▶️ Marco that the word LAME is really not a nice word to say. But

⏹️ ▶️ Marco it’s kind of like the GIMP photo editor, also not a nice word to say. And for whatever

⏹️ ▶️ Marco reason, these open source projects named themselves these acronyms without regard to what those words mean and

⏹️ ▶️ Marco how they hurt people. So I’m sorry in advance, but it’s called the LAME MP3 encoder. And

⏹️ ▶️ Marco I have tested many different encoding options for podcast audio, from MP3 to AAC to

⏹️ ▶️ Marco HE-AAC to mp3PRO, all the add-on spectral

⏹️ ▶️ Marco things to these formats, and simple joint stereo with the LAME MP3

⏹️ ▶️ Marco encoder is by far the highest quality at the best bit rates that I have found.

⏹️ ▶️ Marco It’s really quite good. The problem is MP3 encoding is very difficult to

⏹️ ▶️ Marco parallelize, at least while also doing it well, and so that’s what I’m kind of trying to do

⏹️ ▶️ Marco here.

⏹️ ▶️ Casey Just for ATP?

⏹️ ▶️ Marco Well, I mean, ideally I would ship this to other people as well, maybe open source it, or maybe just

⏹️ ▶️ Marco embed it into my pre-production tool that I hope to release at some point,

⏹️ ▶️ Marco so it could be very helpful. Because one of the problems is we’ve basically hit a wall of single-threaded

⏹️ ▶️ Marco performance on modern computers. We’re not really getting much better at that, and that’s

⏹️ ▶️ Marco one of the only tasks that I do on a regular basis on my computer where I’m just dying for more

⏹️ ▶️ Marco gigahertz on one core. So if I can eliminate that, that makes me happier with

⏹️ ▶️ Marco my computers and it makes me happier with potential future iMacs and Mac Pros

⏹️ ▶️ Marco and MacBooks and things, because I can encode our show four or eight times faster,

⏹️ ▶️ Marco fast enough that it doesn’t matter anymore.
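
Marco does not describe his approach here, so the following is only an illustrative sketch of one common way to spread an encode across cores. It is not his method and not anything in LAME itself. It assumes MPEG-1 Layer III frames of 1152 samples, cuts the input into frame-aligned chunks, and gives each chunk a couple of warm-up frames of overlap, since encoder state (the bit reservoir, for example) carries across frames and is part of why clean seams are hard. The names `plan_chunks`, `encode_chunk`, `encode_parallel`, and `overlap_frames` are hypothetical, and `encode_chunk` is a stand-in that only counts frames instead of calling a real encoder.

```python
from concurrent.futures import ThreadPoolExecutor

# MPEG-1 Layer III encodes audio in frames of 1152 samples per channel.
SAMPLES_PER_FRAME = 1152


def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def plan_chunks(total_samples, workers, overlap_frames=2):
    """Split [0, total_samples) into frame-aligned chunks, roughly one per worker.

    Each chunk is (read_start, keep_start, keep_end) in samples. A worker would
    read from read_start so its encoder state can warm up on the overlap, but
    only output covering keep_start..keep_end would be kept when stitching.
    The overlap size is an assumption for illustration, not a tuned value.
    """
    total_frames = ceil_div(total_samples, SAMPLES_PER_FRAME)
    frames_per_chunk = max(1, ceil_div(total_frames, workers))
    chunks = []
    for first in range(0, total_frames, frames_per_chunk):
        last = min(first + frames_per_chunk, total_frames)
        warmup = max(first - overlap_frames, 0)
        chunks.append((
            warmup * SAMPLES_PER_FRAME,
            first * SAMPLES_PER_FRAME,
            min(last * SAMPLES_PER_FRAME, total_samples),
        ))
    return chunks


def encode_chunk(chunk):
    """Hypothetical stand-in for a real encoder call.

    Returns the number of frames this chunk would contribute to the output.
    """
    _read_start, keep_start, keep_end = chunk
    return ceil_div(keep_end - keep_start, SAMPLES_PER_FRAME)


def encode_parallel(total_samples, workers=4):
    """Process all chunks concurrently and return per-chunk frame counts in order."""
    chunks = plan_chunks(total_samples, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_chunk, chunks))


if __name__ == "__main__":
    # One minute of 44.1 kHz audio split across 8 workers.
    print(encode_parallel(44100 * 60, workers=8))
```

The planning step is the easy part. The hard part Marco alludes to is making the stitched output match a single-threaded encode closely enough that the seams are inaudible, which this sketch does not attempt.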

⏹️ ▶️ Casey That makes sense. Well good luck. I mean I’d love to see that happen because I think

⏹️ ▶️ Casey it would make a lot of people very happy, but god, that sounds like a royal pain in the hindquarters.

⏹️ ▶️ Marco It really is. The one saving grace of this is that normally, if you’re

⏹️ ▶️ Marco dealing with a non-trivially sized open source project, or heck, even a trivially sized

⏹️ ▶️ Marco one, usually it’s just dependency hell. And usually it is so hard to

⏹️ ▶️ Marco get to the point where you can even build the thing, let alone trying to make changes

⏹️ ▶️ Marco and try to test things. It is such a pain because they all depend on these weird, crazy

⏹️ ▶️ Marco libraries that you don’t have or that are hard to install on Mac OS X or something.

⏹️ ▶️ Marco It’s such a pain usually. LAME is actually really simple. There’s not much to

⏹️ ▶️ Marco it. It’s not very many files and there’s, as far as I know, there’s no

⏹️ ▶️ Marco external dependencies except maybe like you know basic libc stuff that everything has. So it’s pretty,

⏹️ ▶️ Marco it’s really actually quite pleasant to work on. So kudos to the LAME project.

⏹️ ▶️ Casey Is this all C or C++?

⏹️ ▶️ Marco It’s all C, which is also nice because I don’t really know a lot of C++. So to have something be pure

⏹️ ▶️ Marco C is also kind of a welcome change.

⏹️ ▶️ Casey It’s like back a few jobs ago, back when you actually had a job.

⏹️ ▶️ Marco Yeah, exactly. I still like C a lot. I recognize that

⏹️ ▶️ Marco working in C nowadays, it’s like having a classic car that you work on in your garage.

⏹️ ▶️ Marco It’s really fulfilling and it’s really nice and cool if that’s

⏹️ ▶️ Marco your personality type, but I really would be hard-pressed to justify

⏹️ ▶️ Marco using it for, like, quote, real work anymore, but I really do enjoy working with it

⏹️ ▶️ Marco on the occasion that I need to work on it.

⏹️ ▶️ Casey Fair enough. I mean, anything’s better than Perl or PHP, am I right?

Sierra Mac Pro support

⏹️ ▶️ Casey So anyway, we should do some follow up. The Sierra Mac Pro support. This is not good

⏹️ ▶️ Casey times for John Siracusa.

⏹️ ▶️ John It’s not so bad times either. Marco’s theory was that it was because of the instructions.

⏹️ ▶️ Marco Yeah, my theory was the hardware, like encryption instructions that Intel has added over time, but turns

⏹️ ▶️ Marco out, nope.

⏹️ ▶️ John Well, I mean, we didn’t get any official word on it, but talking to a whole bunch of people, hey, why do you think they drop support for this

⏹️ ▶️ John computer or that computer? The general consensus was that they just drop support for computers that are

⏹️ ▶️ John old because it’s a pain in the butt to keep supporting them. And there’s lots of things that are a pain to keep supporting.

⏹️ ▶️ John If you do anything to the OS that changes sort of the driver model or that requires drivers to be

⏹️ ▶️ John even just recompiled or changed in some subtle way, supporting old hardware like, say,

⏹️ ▶️ John the airport card that could be put into my 2008 Mac Pro. That’s kind of a pain

⏹️ ▶️ John and you don’t want to bother with that. But the big one, I think, is video drivers, because there is usually some

⏹️ ▶️ John participation from the video card vendor, whether it be AMD, ATI, or NVIDIA,

⏹️ ▶️ John in the creation of the drivers for OS X for these various GPUs.

⏹️ ▶️ John And Apple doesn’t want to keep doing that, and usually if you were to go to AMD and say, hey, we’re making a

⏹️ ▶️ John new version of the operating system, can you either help us make, or make entirely

⏹️ ▶️ John on your own, drivers that work with the Sierra kernel

⏹️ ▶️ John and this ancient 2008 Mac Pro, I don’t think they’re keen to do that. But anyway, lots

⏹️ ▶️ John of reasons why you might not want to support old hardware. And yeah, no

⏹️ ▶️ John specific smoking gun for any particular hardware feature that is not supported.

⏹️ ▶️ John And adding to this pile of support for the idea that it’s just because it’s old

⏹️ ▶️ John is that 2009 Mac Pros are also not supported, contrary to what I had hoped on the last show,

⏹️ ▶️ John so that means my work Mac can’t run Sierra either. What are you gonna do? I’m gonna wait, I’m gonna wait until the new…

⏹️ ▶️ Marco It’s almost like you have to buy a new computer.

⏹️ ▶️ John Oh yeah everyone gets laptops at work now so I’m gonna wait until the new MacBook Pros come out and then

⏹️ ▶️ John I’m going to wait another month and then I’m going to get a new computer at work.

⏹️ ▶️ Marco Now you do realize that the new MacBook Pros will almost certainly have the arrow key layout that you

⏹️ ▶️ Marco hate.

⏹️ ▶️ John I’m not gonna use that keyboard. I mean, it’s just gonna be sitting off

⏹️ ▶️ John to the side with a screen that I never look at, connected to an external screen. I’m just using it as

⏹️ ▶️ John a weird-shaped Mac Mini. That’s my plan, because I only have one screen at work and it’s not

⏹️ ▶️ John a great screen, but whatever, it’s fine, so I’ll keep doing that. I probably want to get the 15-inch so I can put it in mirrored

⏹️ ▶️ John display mode. That’s how much I’m not gonna use a second screen: just so my windows and crap won’t move when I disconnect

⏹️ ▶️ John the monitor. It’ll be the same res on both, you know what I mean? That’s my current plan.

⏹️ ▶️ Marco I love that your current plan clearly didn’t even consider another Mac Pro.

⏹️ ▶️ John Well, I can’t. Work would never buy that for me. My Mac was the first Mac at the entire

⏹️ ▶️ John company, and I had to get special dispensation to get it, and it was way more expensive than the $400

⏹️ ▶️ John piece of crap Dell laptops they were getting everybody back then. Mine was the cheapest possible Mac Pro

⏹️ ▶️ John you could get in the minimum configuration, and it was like $1,500, and it took so much

⏹️ ▶️ John for them to choke that down. Nowadays, they’re better about buying Macs if you want one, and they’ll buy a

⏹️ ▶️ John 13-inch or 15-inch MacBook Pro, and that’s what I plan on getting. But no,

⏹️ ▶️ John a Mac Pro is not an option. Neither is an iMac, for that matter.

⏹️ ▶️ Casey Would you get an iMac over a MacBook Pro if you have the choice?

⏹️ ▶️ John Yes, I would.

⏹️ ▶️ Marco Is the base model 5K iMac not price competitive with the 15-inch

⏹️ ▶️ Marco MacBook Pro?

⏹️ ▶️ John I don’t think it’s price. I think it’s… you don’t understand big corporations. They just want to do things

⏹️ ▶️ John the way they want to do them. And they want regularity and uniformity. And they don’t want you to be a special snowflake

⏹️ ▶️ John who wants a specific computer. They only have and support and know and understand two kinds of computers

⏹️ ▶️ John on the Mac side. And it’s whatever. I don’t know. I mean, maybe I’ll ask about it, but.

⏹️ ▶️ Casey No, this is exactly my experience. Like my company gets leased Macs. And I think

⏹️ ▶️ Casey they’re two-year leases. And basically your choices are a MacBook Pro if you do some sort of development

⏹️ ▶️ Casey work, I believe a MacBook Air if you don’t, or some sort of god-awful Dell. And

⏹️ ▶️ Casey those are, meh, actually it might be an HP, some sort of god-awful PC, but it’s one of those three.

⏹️ ▶️ Casey And if you want anything else under the sun, well, tough noogies, because the help desk does not want to support

⏹️ ▶️ Casey it. So I’m right there with you, John. Now to be fair, my MacBook Pro is very nice. I have no complaints. I’m just saying that

⏹️ ▶️ Casey that is the way corporate America works. You must conform, you must be one of the people in that 1984 commercial.

⏹️ ▶️ Casey You must be a faceless number in the crowd that is using one

⏹️ ▶️ Casey of the blessed computers.

⏹️ ▶️ John I miss the days when I was unsupported, too, because when I was the first Mac in the company, like, all right, but you’re going to be totally unsupported.

⏹️ ▶️ John I’m like, yes, that’s

⏹️ ▶️ John, Marco exactly what I want. I want to

⏹️ ▶️ John be totally unsupported. Don’t touch my computer ever. Do nothing to it. Unfortunately, now the Macs are officially supported.

⏹️ ▶️ John I am beset on all sides by terrible things like add your Mac to the Active Directory

⏹️ ▶️ John network. No, no. All right. Run Symantec antivirus on your Mac. That’s real important.

⏹️ ▶️ John Run these other applications in the background that will grind your CPU to death as they do who knows what. Oh,

⏹️ ▶️ John it’s really terrible.

⏹️ ▶️ Marco I have one question. Do you have to install Adobe Acrobat?

⏹️ ▶️ John I don’t have to, but it could be installed on my behalf silently in the background. Any day I come to work,

⏹️ ▶️ John who knows what’s happened to my computer, because it’s not under my control. They just force-install software on it whenever they want.

⏹️ ▶️ John None of that software makes my experience of using the computer better. But supposedly now I’m protected from

⏹️ ▶️ John viruses. Yeah, whatever.

⏹️ ▶️ Casey I can’t believe you stand for anyone else having any sort of access to your computer. Like I understand that you have no choice. I

⏹️ ▶️ Casey totally get that. But you of all people allowing that vulnerability, if you will,

⏹️ ▶️ Casey I am stunned that you stand for that.

⏹️ ▶️ John It’s rough. But what can you do? Like I said, you have no choice. The final bit is like, how sad is this,

⏹️ ▶️ John really? It’s not really that sad. I installed Sierra on my 2008 Mac Pro and it runs fine because I have an aftermarket

⏹️ ▶️ John GPU. The installer is a little bit cranky about it. I’m installing on an external disk at this point, because

⏹️ ▶️ John I’m not gonna install to my main system. It’s the first developer release. So I installed it on an external hard drive using my 2011 MacBook

⏹️ ▶️ John Air, and then I just deleted a single file from the thing, System/Library/CoreServices/PlatformSupport.plist.

⏹️ ▶️ John No, I renamed it or whatever. And then my Mac Pro 2008

⏹️ ▶️ John boots from it just fine, and it runs fine as far as I can tell. So I don’t

⏹️ ▶️ John think it’ll be that big a deal. I may end up doing the same thing with my Mac at work, although I’m sure corporate IT would

⏹️ ▶️ John love that. But, you know, I could try that right before I

⏹️ ▶️ John give up and say, okay, fine, get me the new laptops or whatever. But anyway, if you have a 2008

⏹️ ▶️ John Mac Pro or other unsupported computer, Sierra might work for you if the set of hardware

⏹️ ▶️ John you have is not the stock set of hardware that came with that computer.

⏹️ ▶️ Casey So now hold on a second. So you generally do some amount of work at home

⏹️ ▶️ Casey on a semi-regular basis, Is that correct? Mm-hmm. So you’re doing that on your personal machine

⏹️ ▶️ Casey because you have a Mac Pro at work and a Mac Pro at home, right? Yeah. So that’s cool with

⏹️ ▶️ Casey you? Like…

⏹️ ▶️ John It’s just an external hard drive. I have like seven hard drives connected to my computer, like four internal

⏹️ ▶️ John and three external, and half of them are turned off most of the time. The other ones are unmounted most of the time. It’s really easy to just put

⏹️ ▶️ John a beta OS on an external drive, and then I just reboot and boot off of that drive and it’s fine. I immediately unmount

⏹️ ▶️ John all my other drives when I launch, and it’s not a concern.

⏹️ ▶️ Casey No, I’m not talking about Sierra at all. I mean on a general level. You have infected

⏹️ ▶️ Casey your home computer with work related things is my point.

⏹️ ▶️ John Well, no, because I just remote desktop into my Mac at work to do stuff.

⏹️ ▶️ Casey Okay, so you have infected it with some sort of VPN connection.

⏹️ ▶️ Casey, John I

⏹️ ▶️ John have, in fact. The VPN software is installed for me to get on the VPN and then Remote Desktop.

⏹️ ▶️ Casey I gotcha. Okay. So any of the development isn’t really strictly speaking happening locally

⏹️ ▶️ John and really it’s not even happening on my Mac either. I mean, from there I’m just SSHing into, you know, like it’s

⏹️ ▶️ John I’m not, unlike you, I’m not actually doing development on my Mac really. I mean, except for like the web browsers

⏹️ ▶️ John that are on it, all the actual code and everything else is on servers, you know. Gotcha.

⏹️ ▶️ Casey All right. That’s actually fairly fascinating. I did not know that. Okay. Anything else about the Mac Pro

⏹️ ▶️ Casey support? It will never end. Anything else today? All right

Info from The Talk Show Live

⏹️ ▶️ Casey The Talk Show Live. This was recorded Tuesday,

⏹️ ▶️ Casey I’m sorry, the Tuesday before this past. God knows, by the time this thing is released it was probably two months ago.

⏹️ ▶️ Casey But anyway, the Tuesday that WWDC was going on, there

⏹️ ▶️ Casey was The Talk Show again this year. Was it two years ago that we were guests, I think? Yes. And

⏹️ ▶️ Casey then the only reasonable conclusion after having this impossibly

⏹️ ▶️ Casey awesome and attractive trio on stage is to have

⏹️ ▶️ Casey Phil Schiller, who followed up last year, and then this year, what are you gonna do? Well,

⏹️ ▶️ Casey why not have Phil Schiller and Craig Federighi? And we were all there, we were sitting

⏹️ ▶️ Casey in the front row, and it was amazing, and I cannot say enough

⏹️ ▶️ Casey good things about this Talk Show. If you are not a regular Talk Show listener, if you’re one of the 10 people that do

⏹️ ▶️ Casey not listen to The Talk Show, but do listen to this show, I cannot encourage you enough to

⏹️ ▶️ Casey listen to this episode of The Talk Show. We’ll have a link in the show notes. It is excellent. There’s also a video,

⏹️ ▶️ Casey which was very good from what I’ve gathered. I haven’t actually watched it myself, but seeing

⏹️ ▶️ Casey Craig and Phil talk to each other and talk to the audience and

⏹️ ▶️ Casey talk to John Gruber, all of the above, was amazing. There’s a few line items that

⏹️ ▶️ Casey we’d like to go through about that, but one of the things that made me kind of laugh and

⏹️ ▶️ Casey really that I loved about it and made me love Craig Federighi even more

⏹️ ▶️ Casey was as The Talk Show was going on, Craig on the surface looked

⏹️ ▶️ Casey comfortable, but if you really looked at him for more than a second, you realized he is not in love with

⏹️ ▶️ Casey what he’s doing right now. Not that John was doing anything wrong, not that Phil was doing anything wrong,

⏹️ ▶️ Casey but you could just tell that Craig was a little nervous and not entirely comfortable with what was going

⏹️ ▶️ Casey on. And he was leaning forward a lot. He kept… I did the same thing when we were on The Talk

⏹️ ▶️ Casey Show at DubDub. He kept leaning forward and then putting his

⏹️ ▶️ Casey elbow on his knee and propping his head up. And then I think he realized that that kind of looks

⏹️ ▶️ Casey a little weird for those in the audience. Then he would try to casually put his arm down, which I’m sure if we looked at the video

⏹️ ▶️ Casey of me on stage, I probably did this 35 times. But there was a lot of that.

⏹️ ▶️ Casey But the thing that struck me the most was when

⏹️ ▶️ Casey John Gruber talked about something developer related. Not only

⏹️ ▶️ Casey did Craig have impossibly good answers for all the questions, sometimes more forthcoming

⏹️ ▶️ Casey than maybe he should have, but he sat bolt upright every

⏹️ ▶️ Casey time. And he was so confident and so sure of himself. And it was so awesome

⏹️ ▶️ Casey to see. And God, I love that guy so much.

⏹️ ▶️ Marco Yeah, it was such a good show. I mean… It really was. You can think about

⏹️ ▶️ Marco what kind of show would you expect to have where Apple executives

⏹️ ▶️ Marco give an interview. And if you think about it just like that, you might think, well, they’re going to be

⏹️ ▶️ Marco going over PR talking points and everything. And certainly some of what they say is PR talking

⏹️ ▶️ Marco points, but most of it isn’t. And most of it is… I

⏹️ ▶️ Marco was on the talk show this week, I don’t think it’s out yet, but I basically reviewed the show for John

⏹️ ▶️ Marco on his own show because I knew he wouldn’t do it himself. So forgive the repetition here,

⏹️ ▶️ Marco but I think one of the best things I liked about it, first of all, is that

⏹️ ▶️ Marco we got to see two Apple executives interacting with each other. And you never see that.

⏹️ ▶️ Marco Even people in Apple hardly ever see that. And so to see that

⏹️ ▶️ Marco in this context, these two people who clearly have known each other for a very long time and have

⏹️ ▶️ Marco worked together for a very long time, two people on top of their game, I mean Phil and Craig are

⏹️ ▶️ Marco really excellent at their jobs. And they both have incredibly

⏹️ ▶️ Marco deep knowledge about the stuff that we care about and the stuff that Gruber would be asking questions

⏹️ ▶️ Marco about at WWDC. So I was thinking, we were kind of speculating beforehand, oh,

⏹️ ▶️ Marco who’s he gonna have this year? Who’s he gonna have this year? And one of the ideas is, what if he has Tim Cook?

⏹️ ▶️ Marco And having Tim Cook would be a great badge of honor. It would be a really noteworthy thing.

⏹️ ▶️ Marco But I think having Phil and Craig is actually a better set for that audience for that

⏹️ ▶️ Marco time. Because Phil is really, in many ways, it seems

⏹️ ▶️ Marco like he is kind of the head of product direction on some level. And

⏹️ ▶️ Marco it seems to be kind of a shared role in Steve’s absence. Jony obviously has something

⏹️ ▶️ Marco to do with it. I think Tim still probably has something to do with it as well. And who knows who else? But it seems

⏹️ ▶️ Marco like a lot of that falls on Phil. So just product decisions in general. Also, Phil is the head

⏹️ ▶️ Marco of the App Store now. So again, very relevant to this audience. And then you have

⏹️ ▶️ Marco Federighi, who is… Yes, we knew he was the executive in charge of

⏹️ ▶️ Marco software engineering stuff. But I don’t think a lot of us knew quite how much of an engineer

⏹️ ▶️ Marco he is. And also, not only does he have deep technical chops,

⏹️ ▶️ Marco but he also has deep knowledge of specific implementation details of

⏹️ ▶️ Marco the stuff they’re doing now. So it’s not like he ascended into this ivory tower and is just

⏹️ ▶️ Marco dictating things down to his minions to have them do all the work and he’s just being a figurehead. He’s clearly still very much

⏹️ ▶️ Marco involved in those decisions and very, very, very technical

⏹️ ▶️ Marco down to deep levels and able to explain that very well.

⏹️ ▶️ Marco So to have that combo of the product guy and the app store guy

⏹️ ▶️ Marco in Phil and then to have Craig as the technical head is really,

⏹️ ▶️ Marco I think, the perfect combo to have at WWDC. Plus, I think their personalities work really

⏹️ ▶️ Marco well, both in general and also with John Gruber being the interviewer. So overall,

⏹️ ▶️ Marco I would much rather have them than have Tim in that context. And I think it was

⏹️ ▶️ Marco as good as it could possibly have gone. I think it was great.

⏹️ ▶️ John I mostly agree with that, that Tim would have been more boring, but I’ve been thinking about it more since WWDC week. It really

⏹️ ▶️ John does depend on the topic that you’re interested in. You mentioned Jony Ive. Obviously,

⏹️ ▶️ John if you’re going to have a bunch of questions about design stuff, he’s the guy you want, and

⏹️ ▶️ John Phil is not going to be able to give you much there, and neither is Craig for that matter, because product design,

⏹️ ▶️ John whether it be hardware or software, is Jony’s domain, and for a lot of the decisions that are made there, I would imagine

⏹️ ▶️ John that they could add some insight into them, but you want the man himself if you want to get that information. So there’s Jony, set aside.

⏹️ ▶️ John And then Tim, after Apple comes out with the car, the guy you want to talk to is Tim Cook,

⏹️ ▶️ John until you know the deep car people, because your main question is why has Apple chosen to make a car, and the person who can answer

⏹️ ▶️ John that best is Tim Cook, right? Because that’s a decision at his level. So I would actually like to

⏹️ ▶️ John see Tim Cook at some point. But for the topics I was interested in this year and most years, yeah,

⏹️ ▶️ John I’d rather hear from Craig and to a lesser extent, Phil, because I’m much more interested in the technical and

⏹️ ▶️ John less interested in the App Store stuff. But at WWDC, like you said, a lot of people are interested in the App Store.

⏹️ ▶️ John Exactly. The reason The Talk Show Live is a topic in the follow-up here is because I was adding

⏹️ ▶️ John items to the follow-up of bits and pieces and I realized most of them were information from The Talk Show Live.

⏹️ ▶️ John The main reason I would suggest people listen to it is that it has information that you probably

⏹️ ▶️ John won’t see elsewhere, because, again, Phil and Craig were on there and they didn’t just say

⏹️ ▶️ John things that were already said at WWDC. They didn’t just say things that have already been said by Apple PR.

⏹️ ▶️ John They provided new information, and some of it I thought was interesting, and you won’t know it unless you

⏹️ ▶️ John listen to that or listen to these two tidbits that I pulled out of it. The first one is about

⏹️ ▶️ John iOS 10, how you can get rid of the Apple apps. If you don’t want the Stocks app or the Tips

⏹️ ▶️ John app or whatever, you don’t have to hide them in a folder anymore. You can now delete them. But they aren’t

⏹️ ▶️ John actually deleted, as many people speculated, and as has been speculated for a long time before the feature even appeared.

⏹️ ▶️ John They’re not actually removed from the system, even though when you delete them you can go to the store and then re-download them.

⏹️ ▶️ John But as, I think it was Craig, said on stage, that download will run suspiciously fast

⏹️ ▶️ John, Marco because they

⏹️ ▶️ John, Casey haven’t actually been removed

⏹️ ▶️ John from your computer. All the App Store is saying is, oh yeah, I’ll re-download those for you, and then it just reveals them again. And a bunch

⏹️ ▶️ John of reasons were given for actually leaving them on the system. You will delete, I think, the documents and data

⏹️ ▶️ John associated with it, but you won’t delete the actual executable. So that was an interesting tidbit.

Conquesting in search ads

⏹️ ▶️ John And the other one is relevant to a couple of stories and tweets that have been going around about the current beta

⏹️ ▶️ John of the App Store ad service, where people can try it out for free now.

⏹️ ▶️ John And all you do at this point is say, yes, I want to opt my app into the ad service. I don’t know, can you do

⏹️ ▶️ John the keywords and everything as well? But

⏹️ ▶️ Marco anyway, I looked around and couldn’t find anything else to do. So I opted my app in. But

⏹️ ▶️ Marco I think that just means that Apple will like kind of shove it on top of search results sometimes. But I don’t think

⏹️ ▶️ Marco I have any controls over that.

⏹️ ▶️ John Yeah, it’s like the automated one where they said, if you don’t want to pick your keywords and do that stuff, we’ll just figure it out for you. And so

⏹️ ▶️ John one of the little controversies that spun up on Twitter was, I think it was someone

⏹️ ▶️ John from Tweetbot was saying they searched for Tweetbot, the actual name of their application, and the number one

⏹️ ▶️ John hit, the ad at the top of the search results, was an ad for Twitterrific.

⏹️ ▶️ John And it looks like, oh, Twitterrific has purchased the Tweetbot keyword. But they haven’t, because you can’t even do that

⏹️ ▶️ John at that point, and they didn’t. And that brings up the whole topic that we discussed before about

⏹️ ▶️ John buying someone else’s trademark as a keyword. So if you have an application, if you were, you

⏹️ ▶️ John know, the maker of Twitterrific and you wanted to buy the keyword Tweetbot, could

⏹️ ▶️ John you do that? That was addressed by Phil Schiller on The Talk Show Live.

⏹️ ▶️ John And he said that they are explicitly allowing this. I think this is a direct quote from the transcript: You

⏹️ ▶️ John can use someone else’s brand in your ad words if you want. We thought about it, that

⏹️ ▶️ John is more likely to benefit the small developer than the big developer. The idea is that if

⏹️ ▶️ John you have a well-known brand, if you’re like Clash of Clans or something, you’re not going to buy a keyword

⏹️ ▶️ John for some tiny little similar game that doesn’t sell a lot of copies. But if you make

⏹️ ▶️ John a game that’s similar to Clash of Clans, and you want to break into the market, and really right now you don’t have any customers,

⏹️ ▶️ John maybe you buy the Clash of Clans keyword. That is the theory behind this, and Phil articulates it

⏹️ ▶️ John in as straightforward a manner as you’re going to get. And it was surprising to a lot of people, and we’ll see how it plays out in real life.

⏹️ ▶️ John But from the perspective of app developers, especially app developers who sort of on our

⏹️ ▶️ John uneven footing, it seems like a pretty scummy thing to do to buy your competitor’s trademark.

⏹️ ▶️ John So I think a lot of people are not going to do it because it seems rude. And I would imagine the companies that

⏹️ ▶️ John would do it are exactly the companies that are big because they have no scruples and aren’t run by an individual

⏹️ ▶️ John who is answerable to their practices and they have a marketing department who’s measured on how well they boost sales.

⏹️ ▶️ John And so I think this is going to happen a lot. I’m not entirely convinced that it’s going to benefit the small developer

⏹️ ▶️ John more than the big developer. But we’ll see. And again, Phil talked directly about this issue when questioned

⏹️ ▶️ John about it on The Talk Show Live, so check it out.

⏹️ ▶️ Marco I mean, I think there’s a number of issues and factors that go into this.

⏹️ ▶️ Marco Who exactly is a small developer? Because I think who this will actually hurt most

⏹️ ▶️ Marco are people like Twitterrific and Tweetbot and maybe Overcast,

⏹️ ▶️ Marco kind of medium-sized developers. We’re not big enough for the Clash of Clans

⏹️ ▶️ Marco people to bid on our keywords, because who cares? We are nothing compared to what they get, and

⏹️ ▶️ Marco we’re also not related. But other people who make apps that are

⏹️ ▶️ Marco related to podcasting or related to Twitter would very easily bid on

⏹️ ▶️ Marco the medium-sized apps in their category and on those keywords. I view myself

⏹️ ▶️ Marco as a small developer in the grand scheme of things, but a lot of people view Overcast as a

⏹️ ▶️ Marco big app in the category relative to their app. I think

⏹️ ▶️ Marco when Phil says he thinks this will benefit small developers, I don’t think he’s talking about people like

⏹️ ▶️ Marco Overcast, Tweetbot, and Twitterrific. I think he’s talking about little things that most of us have never

⏹️ ▶️ Marco heard of that happen to be in these same categories. He’s right that it would probably benefit them.

⏹️ ▶️ Marco Like, you know, the kind of thing where like if you search for like Twitter or whatever, you know, you get a couple

⏹️ ▶️ Marco of Twitter apps and then you get a whole bunch of what looks like just garbage and spam and it’s like you know photo rotation

⏹️ ▶️ Marco apps for Twitter or stuff like that, you know, that kind of stuff. It’s gonna be that kind of

⏹️ ▶️ Marco app, I think, that could benefit from this. Not medium-sized apps like

⏹️ ▶️ Marco Twitterific, Tweetbot, Overcast. Another problem here is that we’ve had download keyword

⏹️ ▶️ Marco spam issues in iTunes Connect since

⏹️ ▶️ Marco the beginning of the App Store. The apps have a keywords

⏹️ ▶️ Marco field that they can enter keywords into for search, and those have historically been used

⏹️ ▶️ Marco pretty strongly for relevance in search, more so than the description that is publicly shown.

⏹️ ▶️ Marco The problem is these keywords are not publicly visible anywhere, and so there’s no downside

⏹️ ▶️ Marco for an app to put its competitors’ or other large apps’ names in those keywords.

⏹️ ▶️ Marco Apple has a rule against this, but in practice that rule is only really enforced and enforceable

⏹️ ▶️ Marco against large trademarks like Disney. It’d be hard to put Disney in your keywords

⏹️ ▶️ Marco and get accepted through App Review, but if another Twitter app puts

⏹️ ▶️ Marco Tweetbot in its keywords, that is way more likely to get in and to get through.

⏹️ ▶️ Marco Historically, that has been seemingly very badly enforced by App Review. Phil even said

⏹️ ▶️ Marco something along the lines of, we’re going to be trying to step that up. Something to

⏹️ ▶️ Marco that degree. Then thirdly, the problem is,

⏹️ ▶️ Marco if you put down Overcast as your keyword, and you might search for the word Overcast to test this

⏹️ ▶️ Marco out. And then my app shows up below some ad. And you might think this is horrible.

⏹️ ▶️ Marco My app is not called Overcast in the App Store. My app is called Overcast: Podcast

⏹️ ▶️ Marco Player. And if you look at almost every other instance of this kind of, like, ad

⏹️ ▶️ Marco doing what seems like the wrong thing, like Tweetbot is not just named Tweetbot. Twitterific

⏹️ ▶️ Marco is not just named Twitterific. So it’s called like, you know, Tweetbot for Twitter, Twitterific, Twitter client,

⏹️ ▶️ Marco stuff like that. Like we’ve, you know, in the last couple of years, we’ve all kind

⏹️ ▶️ Marco of succumbed to having to add keywords to our app names because quite simply

⏹️ ▶️ Marco it works better and the search has been horrible and it’s been one of the only ways.

⏹️ ▶️ John You all gave up on the search improving. You’re like, you know what, forget it. We will mangle the name of our

⏹️ ▶️ John application because there’s no other way to come up in the search results. I mean, I remember when Twitterific did it years ago.

⏹️ ▶️ John Like it was a sad day because it just really uglies up the name. But the problem was people would search for Twitter and just

⏹️ ▶️ John Twitterific would be 700 results down. Like, the relevance was so bad, even though it was a popular

⏹️ ▶️ John third party Twitter client. And when someone searches for Twitter, it would be a good idea to show them the you

⏹️ ▶️ John know, well reviewed popular frequently updated Twitter clients. And instead,

⏹️ ▶️ John it’s like no Twitter doesn’t appear anywhere in your name, or it does, but it’s just a prefix. And like the fact that

⏹️ ▶️ John Twitterific didn’t show up in a search result for Twitter until way low down. That’s just sad.

⏹️ ▶️ John Hopefully, as part of this effort to, like, police the keywords for trademarks more often and stuff, they will also

⏹️ ▶️ John improve the search to the point where you don’t have to name mangle, essentially, anymore. But we’ll see.

⏹️ ▶️ Marco Well, the other thing is, like, with Overcast I tried not name mangling. But, you know, I launched Overcast

⏹️ ▶️ Marco fairly late in the App Store relative to these other apps. There was already an app named exactly the string

⏹️ ▶️ Marco Overcast. In fact, I even contacted the developer and tried to buy it from him, and he agreed, and we had agreed

⏹️ ▶️ Marco to actually transfer this app to me just so I could delete it and reuse the name. But the app

⏹️ ▶️ Marco used iCloud, and so we couldn’t transfer it.

⏹️ ▶️ John Why didn’t you just pay him to rename it?

⏹️ ▶️ Marco Somehow we just dropped the ball on continuing the conversation. It probably just fell apart

⏹️ ▶️ Marco just through apathy. But there is an app called Overcast. I believe the seller’s name

⏹️ ▶️ Marco is Willis Ingersoll, something like that. And it’s a file transfer app for cloud services. And the App

⏹️ ▶️ Marco Store, even though the world of trademark allows for the same name

⏹️ ▶️ Marco to be used in different industries and different contexts, the App Store doesn’t, unless you

⏹️ ▶️ Marco do weird tricks with international titles. So it sometimes does, but usually doesn’t. It’s a mess.

⏹️ ▶️ Marco It’s such a mess. So the combination of

⏹️ ▶️ Marco lots of longstanding App Store problems results in this

⏹️ ▶️ Marco mediocre situation we have with titles with keywords after them, along with the

⏹️ ▶️ Marco new ads not matching things exactly. Probably, yeah, it’s going to be interesting.

Apple Photos keyword searching

⏹️ ▶️ Casey Let’s talk about Apple Photos and the John Siracusa

⏹️ ▶️ Casey controller search.

⏹️ ▶️ John Yeah, that was my quality metric. Well, the original problem that I had was

⏹️ ▶️ John that I couldn’t find a picture of my PlayStation 4 controller in Apple’s Photos application. This is

⏹️ ▶️ John before iOS 10, before WWDC. So I put all my photos into Google Photos, and amazingly, Google Photos was able to find

⏹️ ▶️ John it just by me typing the word controller. And of course Apple announced similar photo

⏹️ ▶️ John search for iOS 10 and macOS Sierra, and

⏹️ ▶️ John I speculated on the last show that it would not be able to find the controller by me

⏹️ ▶️ John typing controller, because Apple is not good at that type of thing and Google is. So now, having

⏹️ ▶️ John installed, oh, actually I haven’t installed it, many other people have installed it. I’ve installed

⏹️ ▶️ John macOS Sierra but I haven’t installed iOS 10. Many other people have installed all these betas onto their iPhones

⏹️ ▶️ John because they’re very brave people, and they have taken my exact pictures of the controller, they pulled them from Twitter

⏹️ ▶️ John and put them into their photo libraries and done the search to see if it would find it. And it doesn’t find it if you type controller,

⏹️ ▶️ John but because Apple’s photo search works a little bit differently, or at least presents a different UI than Google

⏹️ ▶️ John photo search, it does find it when you type control stick or, I think, also gamepad.

⏹️ ▶️ John And this brings up the UI that Apple uses: as you type, it shows, like, category

⏹️ ▶️ John results, as if it puts things into buckets. And indeed it does. Kayin has a blog post

⏹️ ▶️ John that lists out all of the buckets that it has. It has something like 4,000 buckets

⏹️ ▶️ John to put things in. There’s a lot of stuff in there: dance palaces,

⏹️ ▶️ John banana, chairlifts. Wait, what’s a dance palace? Yeah, seriously, I don’t know. I’m just reading

⏹️ ▶️ John some of the keywords here. Orchid, kiln, it’s got a kiln, that’s a good one. Polos, blocks,

⏹️ ▶️ John pulley, trikes, trumpets, tuba, bass horn. I mean, like, it’s got huge amounts

⏹️ ▶️ John of things. But basically, if you type a word that is not one of the words on this list, it won’t find it. So as you

⏹️ ▶️ John type, it’s showing you the words that begin with or contain these things. So it’s a little bit different UI, but it’s trying

⏹️ ▶️ John to lead you to its notion of those things. So if you type, start typing controller and you see control stick,

⏹️ ▶️ John you probably tap that one. Um, so good job, uh, Apple

⏹️ ▶️ John photos so far on finding the controller that I didn’t think you would ever find because that’s, that’s good

⏹️ ▶️ John photo recognition because the controller was upside down. Uh, that’s not an easy search and it’s a pretty obscure category. On the other

⏹️ ▶️ John hand, someone also said that Apple’s photo search stinks because it couldn’t find birthday pics or babies.

⏹️ ▶️ John Uh, or pictures of a couch. So I don’t know if their thing just hadn’t indexed everything yet, or those things aren’t among the keywords

⏹️ ▶️ John or anything like that, but we’ll see as this rolls out
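The search behavior John describes, where typing only surfaces entries from a fixed list of categories that begin with or contain the typed text, can be sketched roughly as follows. This is a toy illustration and not Apple's implementation: the `CATEGORIES` list is a tiny hand-picked subset of the words mentioned in the episode, and the function name `suggest` and the ranking (prefix matches first) are assumptions made for the example.

```python
# Toy sketch of "fixed vocabulary" photo search suggestions.
# CATEGORIES is a hypothetical, hand-picked subset; the real list is said to be ~4,000 entries.
CATEGORIES = [
    "control stick", "gamepad", "dance palace", "banana", "chairlift",
    "orchid", "kiln", "pulley", "trumpet", "tuba",
]


def suggest(typed, categories=CATEGORIES):
    """Return categories that begin with the typed text, then ones that merely contain it."""
    needle = typed.strip().lower()
    if not needle:
        return []
    prefix = [c for c in categories if c.startswith(needle)]
    contains = [c for c in categories if needle in c and not c.startswith(needle)]
    return prefix + contains


if __name__ == "__main__":
    print(suggest("control"))     # partial typing still surfaces a known category
    print(suggest("controller"))  # a word that is not on the list finds nothing
```

In this sketch, typing the first part of "controller" surfaces "control stick", but finishing the word returns nothing because "controller" itself is not one of the categories, which matches the behavior described above.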

⏹️ ▶️ Marco But you can finally find your dance palaces. If you have pictures of dance palaces, it will find that before it will find your couch.

⏹️ ▶️ John Not quite sure what would come up for a dance palace, but yeah, you know, I have my entire photo library in Google Photos

⏹️ ▶️ John and also my entire photo library in Apple’s Photos, so when these things release for real I can do an A/B test

⏹️ ▶️ John right next to each other with the exact same library and do a little bake-off. And when that happens I will definitely

⏹️ ▶️ John have some results. But so far so good for Apple. It is more impressive than I thought it would be.

⏹️ ▶️ John I’m looking

⏹️ ▶️ Marco up what a dance palace is.

⏹️ ▶️ Marco Go to fractureme.com and check it out with code ATP10 for 10% off. Fracture is a

⏹️ ▶️ Marco company that prints photos in vivid color directly onto panes of glass. It

⏹️ ▶️ Marco looks fantastic. The colors pop like you won’t believe, and it comes in a nice solid foam

⏹️ ▶️ Marco backing that’s ready to mount right out of the package. So here’s how this works. It looks just like a complete

⏹️ ▶️ Marco picture. You don’t have to frame it. You don’t have to do anything else to it. It is a complete picture. It is a rectangle

⏹️ ▶️ Marco or square of glass with your photo printed right behind the front layer of glass. and then behind that

⏹️ ▶️ Marco is a little tiny bit of foam board so that you can hang it easily because then things can

⏹️ ▶️ Marco hook into the foam board without having to scratch the glass or the ink layer. So it just looks fantastic. It looks

⏹️ ▶️ Marco very modern and clean and I have these all over our house, our office.

⏹️ ▶️ Marco They’re everywhere. People always compliment them. They make great gifts. They make great keepsakes. You can get

⏹️ ▶️ Marco your photos and actually print them and actually have a physical representation so you can remember this photo for more than

⏹️ ▶️ Marco the two days it sits in your Facebook feed or whatever. Check it out, the prices are very reasonable for Fractures.

⏹️ ▶️ Marco They start at just $15 for their small square size, which is great for Instagram prints.

⏹️ ▶️ Marco They also make non-square rectangles. They are awesome. Again, they make fantastic gifts. We’ve given

⏹️ ▶️ Marco them as gifts many times. These prints look fantastic from Fracture. So check it out today

⏹️ ▶️ Marco at FractureMe.com and get 10% off with the code ATP10.

⏹️ ▶️ Marco I recommend them. They’re great.

Thunderbolt Display discontinued

⏹️ ▶️ Marco Check out Fracture. Thanks a lot. All

⏹️ ▶️ Casey right, so some slightly late-breaking news before we recorded. The

⏹️ ▶️ Casey Thunderbolt display, the thing that I’ve been lusting after since September 2011, when

⏹️ ▶️ Casey it was brand new, which was the last time it was updated. Anyway.

⏹️ ▶️ Marco Even that was a minor update over the LED display that came out right before it.

⏹️ ▶️ Casey Yeah, fair enough. The Thunderbolt display has been discontinued. Sad times. So at this

⏹️ ▶️ Casey point, Apple does not sell an external display.

⏹️ ▶️ John They did have a minor update where they started shipping the MagSafe 1 to 2 adapter

⏹️ ▶️ John in the box.

⏹️ ▶️ John, Casey It’s a hardware change, right?

⏹️ ▶️ Casey Something like

⏹️ ▶️ Casey, Marco that.

⏹️ ▶️ Marco So they never actually gave it the MagSafe 2, like natively, they just kind of threw this $8 adapter in the box. Yeah.

⏹️ ▶️ Casey Excuse me, I believe that adapter is $10. Thank you very much. Sorry, you’re right. You’re right.

⏹️ ▶️ Casey, Marco So

⏹️ ▶️ Casey this is somewhat sad times. If you fancied having a non-retina

⏹️ ▶️ Casey external display that’s very pretty to look at and presumably for non-retina being very,

⏹️ ▶️ Casey very pretty to see what’s on the screen as well. However, does this make us think that

⏹️ ▶️ Casey maybe a retina display is coming soon? Maybe with a GPU built in? I don’t

⏹️ ▶️ Casey know. I would think so, but it seems odd to me that they would discontinue without having something to

⏹️ ▶️ Casey replace it.

⏹️ ▶️ John Yeah, maybe they just don’t sell enough of them. Maybe they can’t get that screen from the vendor anymore because it’s so terrible because the pixels

⏹️ ▶️ John are the size of boulders. I don’t know. Did you read Apple’s official statement that I put in the notes?

⏹️ ▶️ Casey Yeah, it was really short, wasn’t it? Where did it go? We are discontinuing the Apple Thunderbolt display. It will be available

⏹️ ▶️ Casey through Apple.com, Apple’s retail stores, and Apple authorized resellers while supplies last. There are a number of great

⏹️ ▶️ Casey third-party options available for Mac users.

⏹️ ▶️ Marco It’s kind of surprising that they even gave a statement about it, that they called any attention

⏹️ ▶️ Marco to it at all. when Apple discontinues a product, like one that’s not very important,

⏹️ ▶️ Marco they usually just kind of quietly remove it from sale. Like I think it’s

⏹️ ▶️ Marco, John unusual to- But there’s usually a

⏹️ ▶️ John clear replacement when they do

⏹️ ▶️ John, Marco that. That’s true. Like they

⏹️ ▶️ John remove the old one and there’s always the new one and this is a little bit weird where they’re removing the old one without a new one to replace

⏹️ ▶️ John it. I still put the odds at greater than 50% that there will be a replacement for this

⏹️ ▶️ John product.

⏹️ ▶️ John, Marco Oh yeah.

⏹️ ▶️ John Just because it’s such a gimme. They have the screen, it’s in the 5K iMac, we’ve been staring at it for a long time, you put

⏹️ ▶️ John it in a case, you just ship it. Like, but maybe that’s not ready yet. And maybe it’s part of a hardware announcement. And

⏹️ ▶️ John for whatever reason, that’s like, I think, I don’t think they sell a lot of Thunderbolt displays.

⏹️ ▶️ John It’s a $1,000 monitor that is not worth $1,000 at this point. Like, arguably, you could argue

⏹️ ▶️ John about it when it was first introduced, but it is a really nice monitor. And back when there were no options for 27-inch Retinas, when

⏹️ ▶️ John it was introduced in 2011. So it was good, whether it was worth $1,000, who knows, but at this point,

⏹️ ▶️ John it’s definitely not worth $1,000. So they must not have been selling a lot of them. And as they note, you know, while

⏹️ ▶️ John supplies last, they need to get rid of all this inventory. Maybe it takes this long to get rid of the inventory at the

⏹️ ▶️ John rate that they sell. Like maybe they have a lot of Thunderbolt displays hanging around and they did the math and said, if we

⏹️ ▶️ John want these flushed out of the inventory by the time we introduce the 5K external display, we need to discontinue it now

⏹️ ▶️ John and just let them sell out the rest of them. It would also help by the way, Apple, if you lowered the price, cause that’s a way that you can sell things

⏹️ ▶️ John faster, but I know you don’t do that.

⏹️ ▶️ Marco No, I mean, I think it’s more likely that like, you know, the, the rumors were all really clear

⏹️ ▶️ Marco and the timing makes a lot of sense that the new Skylake-based MacBook Pros

⏹️ ▶️ Marco were supposed to be out by now. And it seemed that they were probably delayed because of various Intel

⏹️ ▶️ Marco issues. You know, like, have you heard… Every time we mention Skylake and laptops, we usually hear from people

⏹️ ▶️ Marco who tell us that Skylake chips that would be appropriate

⏹️ ▶️ Marco for the 15-inch were kind of out in the PC world, and they actually had tons of problems.

⏹️ ▶️ Marco And it seems like Intel might have partly recalled them or something. There were some

⏹️ ▶️ Marco problems with them that the manufacturers were having even after they shipped. So it seems like

⏹️ ▶️ Marco there was some kind of unexpected Intel delay that caused the MacBook Pros to most likely be delayed.

⏹️ ▶️ Marco So it seems like if Apple is going to release a new 5K display, it would need Skylake most likely to drive

⏹️ ▶️ Marco it, so that they were probably planning on releasing the 5K external display that succeeded

⏹️ ▶️ Marco this Thunderbolt display at the same time as new MacBook Pros that were all supposed to already be out by

⏹️ ▶️ Marco now. So chances are they probably gave the order months

⏹️ ▶️ Marco ago to stop taking these component orders, stop making these things, stop manufacturing these,

⏹️ ▶️ Marco and then this delay happened sometime in the meantime. This was probably just like the replacement

⏹️ ▶️ Marco product was supposed to be out by now, and it might be done, but there’s no computer that can drive it yet.

⏹️ ▶️ Marco So they’re not going to release a display with no computer that can drive it. So they kind of held the display

⏹️ ▶️ Marco for the MacBook Pro, which got delayed. So now they just have like, well, we already

⏹️ ▶️ Marco told the manufacturing line to stop making these and we’re not going to start it back up again because that’s

⏹️ ▶️ Marco crazy. So I’m guessing we’re just in this weird hole in the middle of this, like,

⏹️ ▶️ Marco you know, this, this, these weird delays cause this, this operations hole to happen and

⏹️ ▶️ Marco it’ll make sense in a few months or whatever, you know, whenever the new MacBook Pros come out.

⏹️ ▶️ John And by the way, I think the reason they’re going to make this product is everyone’s like, maybe they’re just not going to sell external displays. The reason they’ll make it is because

⏹️ ▶️ John the margins have to be pretty darn good.

⏹️ ▶️ John, Marco Oh yeah.

⏹️ ▶️ John Probably going to sell it for the same, you know, for the same kind of margin as the Thunderbolt Display. Like I said, it was a thousand dollar

⏹️ ▶️ John monitor in an age when you could get monitors with sometimes monitors with the exact same panel in them for

⏹️ ▶️ John less money if you knew the right, you know, the right weirdly numbered Dell thing or whatever to buy like

⏹️ ▶️ John good healthy margins on those. And Apple sells mostly laptops for people who buy Macs.

⏹️ ▶️ John And people with laptops often want a really big external display like they don’t want an external display that’s like 17

⏹️ ▶️ John inches. But if people want an external display, they want a really big one. And so Apple should sell you

⏹️ ▶️ John a really big one because it’s you know, the same reason they sell battery cases for crying out loud. If there’s, this is a product that people

⏹️ ▶️ John want, and we can sell it at high margin, and it can look nice, and Jony Ive won’t be upset

⏹️ ▶️ John by someone connecting a disgusting looking Dell monitor to their beautiful MacBook Pro. Like all signs

⏹️ ▶️ John point towards Apple continuing to sell one really big, really expensive high margin external

⏹️ ▶️ John monitor and all signs point to me buying one once I can hook it up to a computer.

⏹️ ▶️ Marco We go to hover.com and use promo code APFS at checkout for 10%

⏹️ ▶️ Marco off your first purchase. Hover is a great place to buy domain names. So

⏹️ ▶️ Marco when you have a great idea for your project or blog or startup, whatever, you need to give it a great domain

⏹️ ▶️ Marco name. And finding that name is very, very easy with Hover. And of course, they make it very easy to then buy it and manage

⏹️ ▶️ Marco it. And I use Hover for tons of my domains and it’s great. They have over 400 domain extensions to end

⏹️ ▶️ Marco your domain with: the classics like .com, .net, some of the new narrow

⏹️ ▶️ Marco ones like .design and .tech, and some of the weird kind of jokey ones like .pizza and .ninja.

⏹️ ▶️ Marco Now new this week, they just launched .store. And this is one of the rare

⏹️ ▶️ Marco occasions where there’s a name that is straightforward. It’s not goofy or amateurish

⏹️ ▶️ Marco like .ninja and .pizza. You can build a real business on something that ends in .store. This is

⏹️ ▶️ Marco a very rare occasion where brand new real estate has opened up that ends in a .store domain extension that

⏹️ ▶️ Marco is not dumb. So it’s wide open because it just opened up. So go to hover.com,

⏹️ ▶️ Marco go get a dot store name. And you have a very, very good chance now because it’s brand new of getting a

⏹️ ▶️ Marco really good one. So whether it’s dot store or any of the other 400 domain extensions out there, you can find

⏹️ ▶️ Marco the perfect domain name for your idea at hover, go to hover.com and use promo code APFS

⏹️ ▶️ Marco at checkout, which stands for I believe, Apple platform file system.

APFS: Schedule and migration

⏹️ ▶️ Marco APFS at checkout to save 10% off your first purchase. Thanks to Hover for sponsoring our show.

⏹️ ▶️ Casey John, tell us about Apple File System.

⏹️ ▶️ John Where are you getting the platform part from? Apple Platform File System.

⏹️ ▶️ Casey Somebody

⏹️ ▶️ John said that.

⏹️ ▶️ Casey I

⏹️ ▶️ John thought I’d heard that as well. A lot of people said a lot of things. I’m like, is that from official? I haven’t heard any official thing. I think

⏹️ ▶️ John it’s just for Apple. Anyway, there’s a WWDC session that is entirely open to the public, session 701.

⏹️ ▶️ John We will put a link in the show notes so that if you’re interested in Apple file system, you can watch this presentation.

⏹️ ▶️ John At the time we recorded the WWDC episode, we had not seen this presentation yet because it was two days

⏹️ ▶️ John in the future. And there wasn’t much new information in that thing that wasn’t already in the

⏹️ ▶️ John State of the Union, but there were two tidbits worth discussing. One is,

⏹️ ▶️ John all right, so APFS, as we saw in the State of the Union, is like a, will be a developer preview.

⏹️ ▶️ John And I think a lot of people were confused by that, like, okay, so everything’s developer preview, right? But now when Sierra ships

⏹️ ▶️ John for real, like the final version ships to customers, APFS will be included in Sierra

⏹️ ▶️ John as a developer preview, as in you will get the official release version of Sierra. And part

⏹️ ▶️ John of that product will be provisional developer level support for APFS.

⏹️ ▶️ John So what does that mean for, you know, or how, how far in the future is APFS? As they said in the State of the Union,

⏹️ ▶️ John APFS is developer preview this year and coming in 2017. But what does that even mean? A slide

⏹️ ▶️ John in session 701 had these words on it, and we’ll see how this rolls out.

⏹️ ▶️ John APFS will be the default file system for all Apple products in 2017. So

⏹️ ▶️ John that is pretty unequivocal, except for the part where it says 2017. So default file system for all Apple products

⏹️ ▶️ John means if you buy an Apple product in, you know, after whatever this point is,

⏹️ ▶️ John it will be formatted with APFS. That includes all iPhones, all iPads, all Macs, everything.

⏹️ ▶️ John That’s the implication of the statement. “In 2017” could mean anything. It could mean starting January 1, 2017.

⏹️ ▶️ John It could mean starting December 31, 2017. So there’s a big range in when this

⏹️ ▶️ John could happen. But it is a pretty clear statement of intent

⏹️ ▶️ John in a publicly available video from official Apple spokespeople that their goal is 2017,

⏹️ ▶️ John not just that, oh, you’ll be able to format as APFS but the default will still be HFS plus, or only on Macs

⏹️ ▶️ John or whatever. Nope, every single product, every single device that has a file system across the entire product range.

⏹️ ▶️ John At some point, this is their goal. Anyway, we’ll see if they meet it. So that seemed to me like an ambitious

⏹️ ▶️ John goal. But I’m also kind of excited about it. Because as we all know, 2017 is the year of the file system.

⏹️ ▶️ John And I hope they take their time because that’s a little bit I’m a little bit scared

⏹️ ▶️ John for them. Like, that means like when when the next iPhone comes out, or whatever the 2017 iPhone,

⏹️ ▶️ John all of those millions and millions of iPhones are all going to be formatted with APFS. I

⏹️ ▶️ John really hope that goes well. You know, or I suppose they could say, okay, well, again, if they go to December

⏹️ ▶️ John 31st, 2017, then next year’s iPhone will ship with HFS plus and then December 31st starting

⏹️ ▶️ John that all the products will be formatted that way.

⏹️ ▶️ Casey So in case anyone at Apple, anyone who knows any of the Apple file system people

⏹️ ▶️ Casey is listening right now, consider that John Siracusa, your friend and mine, has been asking

⏹️ ▶️ Casey for a new file system for years, has been begging for a new file system. And

⏹️ ▶️ Casey as he just said, you guys, you know, Apple has said in 2017. I would just

⏹️ ▶️ Casey like to point out that John Siracusa’s birthday happens to be

⏹️ ▶️ Casey the very last day of 2017. If you just decided, you know, maybe that would

⏹️ ▶️ Casey be the appropriate day to flip the switch and hit the big red button, that would be a pretty solid birthday present, a much

⏹️ ▶️ Casey better present than I’ve ever gotten, John. Just throwing it out there, guys. You run with it.

⏹️ ▶️ John Yeah, I think they’re gonna go before that and my impression is that

⏹️ ▶️ John this date, even though it seems ambitious to us, is not really that ambitious from the perspective of Apple,

⏹️ ▶️ John who has been working on this file system for a long time. Obviously, like, they didn’t just announce it and start working on it last week, right?

⏹️ ▶️ John So the impression I get from WWDC is that

⏹️ ▶️ John 2017 is not rushing it, and really they’ve got the whole year. It’s like, you know, it’s a relaxing expanse of time

⏹️ ▶️ John in which they can plan their rollout of this new file system. And I really hope that’s true, because if so, that’s exactly how you

⏹️ ▶️ John want to do this. Uh, the second tidbit that was in this WWDC session, which I’m sure everyone knows

⏹️ ▶️ John by now, because it’s been on all of the Apple news sites, uh, is the question of if you don’t buy

⏹️ ▶️ John a new piece of hardware, you know, which comes with APFS, but you have an existing piece of hardware,

⏹️ ▶️ John how do you, uh, get APFS? Like I can only think about it from the perspective of a Mac,

⏹️ ▶️ John because I don’t imagine they’re gonna let people, like, reformat their iPhones. Like, if you restore your iPhone

⏹️ ▶️ John in 2017 in the post-APFS world, is it reformatted with APFS? I doubt it. But certainly for the Mac

⏹️ ▶️ John we know this is a straightforward situation. The way you could do it is you back up your entire hard

⏹️ ▶️ John drive, you erase it, you reformat it as APFS, and you restore your hard drive onto it. You can do Time Machine, you can do

⏹️ ▶️ John SuperDuper, there’s lots of different solutions to doing this. All those should be possible. But in the 701 presentation

⏹️ ▶️ John at WWDC, Apple said that they have something else in mind, and that is in-place conversion

⏹️ ▶️ John of your volume format from HFS plus to APFS. So you’ve got a Mac, this

⏹️ ▶️ John file system comes out in 2017, it’s officially released, all the new hardware you buy comes with it, you’re like, I want to try that new

⏹️ ▶️ John file system, what do I do? Well first of all I would advise you to make a backup anyway, and maybe

⏹️ ▶️ John two backups, and maybe three backups. But you won’t have to erase your disk. You’ll be able to

⏹️ ▶️ John take your existing disk that’s formatted with HFS plus and do a thing

⏹️ ▶️ John and reboot, and when it comes back all your data will still all be there. You won’t

⏹️ ▶️ John have erased it, and it will be an APFS volume. And the concept of this

⏹️ ▶️ John is freaking people out a lot. I think I talked about the old Alsoft PlusMaker thing that would do the same conversion

⏹️ ▶️ John from HFS to HFS plus in place, and that was terrifying because if it screwed up in the middle you were left with nothing.

⏹️ ▶️ John The situation with APFS is a little bit different, and you shouldn’t actually be

⏹️ ▶️ John as terrified as you are. If you’ve ever seen, from,

⏹️ ▶️ John I don’t know, was this on ATP or maybe it was Hypercritical, the btrfs

⏹️ ▶️ John demo of a similar type of in-place conversion, it’s the same type of deal. Because

⏹️ ▶️ John of the way this volume format works, it’s actually fairly safe. So here’s the process

⏹️ ▶️ John and here’s why it will probably not destroy your data, although you should definitely make backups.

⏹️ ▶️ John So the

⏹️ ▶️ John, Marco first thing it’s going to

⏹️ ▶️ John do is it’s going to unmount your volume. So obviously you can’t be booted from your volume to reformat it. You have to be booted from the recovery

⏹️ ▶️ John partition or something like that or whatever. But anyway, once it’s unmounted, no more changes are happening to

⏹️ ▶️ John this volume that you’re going to convert. So you don’t have to worry about chasing after some other change. It’s not going to happen live

⏹️ ▶️ John like the, what do you call it, the FileVault conversion, where you can encrypt a disk while you’re using it, which is kind of magical. This is, like,

⏹️ ▶️ John unmounted. All right. Next thing is, it’s going to write all the APFS metadata

⏹️ ▶️ John structures, all the little pointers to where does this file begin? Where does it end? How many blocks are in it?

⏹️ ▶️ John What is the file called? What are all the dates about the file? It’s going to write all that metadata to the free space on the

⏹️ ▶️ John HFS plus volume. So you can only do this conversion if you have enough free space to fit the metadata. I don’t know how much it’s going to take, but

⏹️ ▶️ John you do need some free space. And hopefully you’re not using your disk filled to the brim anyway. So it’s writing

⏹️ ▶️ John just to the free area. So at any point during this process, when it’s writing all this metadata, when

⏹️ ▶️ John it’s reading your HFS plus volume, finding out where everything is, what everything is called, what all the dates are, what all the permissions are,

⏹️ ▶️ John all the ownership, all everything about these files, all the different forks, all the extended attributes, all that stuff, and it’s

⏹️ ▶️ John writing that metadata out in APFS format to the free space. If at any time something bad happens

⏹️ ▶️ John during this, like someone yanks the plug in your computer, your HFS plus volume is fine. Because if you reboot into it,

⏹️ ▶️ John it doesn’t care what’s in the free space. there’s garbage in the free space. It’s free space on the HFS plus file. Nothing is wrong with

⏹️ ▶️ John it. It’s perfectly fine. This is the majority of the time it spends is looking at all your things and

⏹️ ▶️ John writing all the metadata for all the things on your file system. When it’s finally

⏹️ ▶️ John finished doing that, all it needs to do is this is sort of the critical section when things can go wrong, and the

⏹️ ▶️ John critical section should be very, very short. It’s basically delete the HFS plus super block, update

⏹️ ▶️ John the partition type to be APFS, put the correct UUID in there, and you’re done. Like, you just have to do

⏹️ ▶️ John a switcheroo in the front of the whole file system and say, OK, now you’re not HFS anymore. Now you’re APFS.
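The conversion steps described above (unmount, write the new metadata only into free space, then flip the superblock in one short critical section) can be modeled with a toy in-memory volume. This is a minimal sketch under stated assumptions and not how Apple's converter works internally: `ToyVolume`, `convert_in_place`, `PowerLoss`, the one-metadata-block-per-file rule, and the `fail_before_commit` switch are all hypothetical names and simplifications invented for this illustration.

```python
from dataclasses import dataclass


class PowerLoss(Exception):
    """Simulated interruption, e.g. someone yanks the plug."""


@dataclass
class ToyVolume:
    fmt: str        # format named by the toy "superblock"
    blocks: list    # raw block contents, one bytes object per block
    catalogs: dict  # fmt -> {"files": {name: [block indexes]}, "meta": [block indexes]}
    mounted: bool = True

    def free_blocks(self):
        """Blocks the current format does not reference; whatever they contain is ignored."""
        cat = self.catalogs[self.fmt]
        used = set(cat["meta"])
        for extents in cat["files"].values():
            used.update(extents)
        return [i for i in range(len(self.blocks)) if i not in used]

    def read(self, name):
        extents = self.catalogs[self.fmt]["files"][name]
        return b"".join(self.blocks[i] for i in extents)


def convert_in_place(vol, new_fmt="APFS", fail_before_commit=False):
    # Step 1: unmount, so nothing else changes the volume mid-conversion.
    vol.mounted = False

    # Step 2: write new-format metadata into free space only.
    old_files = vol.catalogs[vol.fmt]["files"]
    free = vol.free_blocks()
    if len(free) < len(old_files):  # toy rule: one metadata block per file
        raise RuntimeError("not enough free space for the new metadata")
    new_cat = {"files": {}, "meta": []}
    for slot, (name, extents) in zip(free, old_files.items()):
        vol.blocks[slot] = f"{new_fmt} record for {name}".encode()
        new_cat["meta"].append(slot)
        new_cat["files"][name] = list(extents)  # point at the data where it already lives
    vol.catalogs[new_fmt] = new_cat

    if fail_before_commit:
        raise PowerLoss("interrupted before the superblock swap")

    # Step 3: the short critical section, flipping the format tag.
    vol.fmt = new_fmt
    vol.mounted = True


def demo_volume():
    return ToyVolume(
        fmt="HFS+",
        blocks=[b"hfs catalog", b"hello ", b"world", b"photo", b"", b"", b"", b""],
        catalogs={"HFS+": {"files": {"a.txt": [1, 2], "b.jpg": [3]}, "meta": [0]}},
    )
```

In this model, an interruption during step 2 leaves the old format's view of the disk untouched, because only blocks it considers free were written; only the final tag flip changes which catalog is authoritative, and the file data blocks are never rewritten.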

⏹️ ▶️ John And then you reboot. And that critical section of when something can go wrong, like if they were halfway through, and I bet even that probably

⏹️ ▶️ John is fairly well protected because, well, I guess once you delete the HFS plus

⏹️ ▶️ John superblock, you’re kind of screwed. But anyway, this is a very small window of time when something can go wrong. All the

⏹️ ▶️ John rest of the time during the conversion, everything is fine. And I bet you could probably restore the HFS plus superblock if you were

⏹️ ▶️ John clever enough, or some recovery program could do that as well, because again, you haven’t touched any of the actual data.

⏹️ ▶️ John Your actual files have not been modified. The HFS plus metadata has not been modified. None of that stuff has

⏹️ ▶️ John changed. It’s just sitting exactly where it was. All you got to do is change sort of the thing

⏹️ ▶️ John that tells you where everything else in the file system is. The superblock is metadata about the whole file system that tells it, you know,

⏹️ ▶️ John how big the file system is and where all the other metadata structures are and all that stuff. And by the way,

⏹️ ▶️ John speaking of the APFS UUID, the thing that tells it, like, what volume type this is, one other tidbit that

⏹️ ▶️ John was sort of alluded to in State of the Union and I think in the 701 thing is that if you have

⏹️ ▶️ John an APFS volume and you try to mount it on

⏹️ ▶️ John like an older Mac operating system, it’ll say this disk is unreadable, do you want to initialize it, all that business. But if you

⏹️ ▶️ John try to do that on El Capitan, it won’t do that, not because El Capitan has secret support for APFS, but just

⏹️ ▶️ John because it knows, it recognizes the UUID, and so it won’t offer to erase it for you. And that’s, I guess,

⏹️ ▶️ John to try to prevent people from accidentally erasing their newly converted disk when they don’t realize that that prompt

⏹️ ▶️ John asking them to erase it was actually asking them to erase the disk they just converted. So that’s, that’s

⏹️ ▶️ John kind of clever, and that’s another secret thing you would never have known. You’re not going to be searching the El Capitan

⏹️ ▶️ John binaries for a UUID, because you’d never find it, because it’s all nonsense. But anyway, I

⏹️ ▶️ John assume that I will do this in-place conversion, again, after making many, many backups, but

⏹️ ▶️ John both in theory and in practice, this conversion method of writing new metadata structures to free space

⏹️ ▶️ John pointing at the data where it already exists is fairly well established and not something

⏹️ ▶️ John crazy that Apple came up with on its own. And so I’m pretty confident that this will actually work

⏹️ ▶️ John and it will seem like magic.
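
As a rough illustration of the conversion described above, here is a toy Python model. It is not Apple’s implementation: the class name `ToyVolume`, its fields, and the `fail_before_switch` flag are all hypothetical. It only shows the shape of the idea, a long safe phase that writes new metadata into free space, followed by a short critical step that flips the superblock.

```python
class ToyVolume:
    """Toy model of a volume: file data, per-format metadata, and a superblock."""

    def __init__(self, files):
        # File contents; the conversion never modifies these.
        self.data = dict(files)
        # Metadata trees keyed by format name; only HFS+ exists at first.
        self.metadata = {"HFS+": sorted(files)}
        # The superblock records which metadata tree is the live one.
        self.superblock = "HFS+"

    def convert(self, fail_before_switch=False):
        # Phase 1 (long, safe): write APFS metadata into what HFS+ sees as free space.
        self.metadata["APFS"] = list(self.metadata["HFS+"])
        if fail_before_switch:
            return  # simulated power loss: the superblock still says HFS+
        # Phase 2 (short, critical): flip the superblock to the new format.
        self.superblock = "APFS"

    def mount(self):
        """Return (format, file names) as seen through the live superblock."""
        return self.superblock, self.metadata[self.superblock]


volume = ToyVolume({"notes.txt": b"hello"})
volume.convert(fail_before_switch=True)
print(volume.mount())  # still mounts through the HFS+ metadata
volume.convert()
print(volume.mount())  # now mounts through the APFS metadata
```

In this sketch an interrupted first phase leaves stray APFS metadata behind, but the volume still mounts as HFS+ because the superblock was never touched.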

⏹️ ▶️ Casey That’s pretty high praise. I am very impressed.

APFS: Naming

⏹️ ▶️ Casey Why does it have the P in the name? Why not AFS?

⏹️ ▶️ John Oh yeah, so that’s a question I asked some of the file system people at WWDC,

⏹️ ▶️ John and it’s as we speculated. Like, we can’t use AFS because that’s already Andrew File System. And

⏹️ ▶️ John you can go through every letter in the alphabet. A lot of them are taken already, or every combination of two letters are either taken

⏹️ ▶️ John or nonsensical. APFS is not great as an acronym, but

⏹️ ▶️ John it’s probably better than AFS. Probably better than IFS, which I think is also already taken. Lots of the cool

⏹️ ▶️ John letters like ZFS and XFS are also already taken. My complaint was, why isn’t it a cool name?

⏹️ ▶️ John Why isn’t it like, forget about the four letter acronym or give it whatever four letter, three or four letter

⏹️ ▶️ John abbreviation that you want. Why doesn’t it have a cool name like, I don’t know, like Thunder FS,

⏹️ ▶️ John Swift or Grand Central Dispatch or whatever, like with the trains. Yeah, like

⏹️ ▶️ John come up with a cool marketing name for it. And the upshot seems

⏹️ ▶️ John to be that file systems are not something that most users know about.

⏹️ ▶️ John Certainly on iPhones, no one knows what their file system is. And even on Macs, people don’t know. The only time

⏹️ ▶️ John people are likely to encounter it is if you’re on a Mac and you go to Disk Utility or something, you’re trying to format a new disk.

⏹️ ▶️ John Or like if you get info on a disk and you see in the Get Info window. But especially when you’re formatting your disk, it says like, oh, how

⏹️ ▶️ John do you want to what volume format do you want to use for this disk? And Disk Utility, I think, can do a bunch of different

⏹️ ▶️ John formats. Or even OS X can, OS X, whatever that operating system is called, yeah,

⏹️ ▶️ John macOS, can mount a lot of different things. It can mount, like, FAT, exFAT,

⏹️ ▶️ John and all these weird, all the weird CD file systems and stuff. Sometimes if you present a user

⏹️ ▶️ John with this pop-up menu of like, what volume format do you want to make this, or what volume format

⏹️ ▶️ John is this, they don’t know what all those things mean. They don’t know what HFS plus means. They don’t know what Apple Extended Journaled

⏹️ ▶️ John whatever case, they don’t know what those things mean at all. So the goal of APFS,

⏹️ ▶️ John and it’s a reasonable goal, is to put an entry on that list that is absolutely clear that that’s the one that they want.

⏹️ ▶️ John And so if you have a pop-up menu and you’re formatting a disk and you don’t know what the hell you want to pick and you see one called Apple File System,

⏹️ ▶️ John you’re going to pick that one. Because you’re like, all right, this is an Apple. I see a logo in front of me. There’s one on

⏹️ ▶️ John the corner of the screen. I guess I want Apple File System. And you know what? That’ll be the right answer. You do want

⏹️ ▶️ John Apple File System. You do not want, you know, Apple Extended Journaled whatever.

⏹️ ▶️ John You want Apple File System. So as boring as it may be, it makes sense from

⏹️ ▶️ John a user’s perspective that a user should never have to see this, but when they do see it, they have no idea what all those weird names mean.

⏹️ ▶️ John They just want to pick the one that says Apple file system.

⏹️ ▶️ Marco But don’t you think there might be like a problem? Like, well, that one has a plus. This one sounds deluxe. Like the more

⏹️ ▶️ Marco words that like it sounds like that’s the deluxe cool option. Like I guess maybe I want the what whatever

⏹️ ▶️ Marco journaling means. That’s I

⏹️ ▶️ Marco, John like journaling

⏹️ ▶️ John that’s the, that’s the prosumer problem. Maybe those people, you know, they’re, they’re not worth anybody,

⏹️ ▶️ John but regular people are just gonna pick the safest one. They’re not gonna say, maybe, I don’t know what journaled means, but that one may be better. What about

⏹️ ▶️ John extended? That’s better. And what about exFAT? I like that one because it’s got an X.

⏹️ ▶️ John I think they’re just gonna pick Apple File System. Anyway, it’s a boring name, but

⏹️ ▶️ John it is certainly a straightforward name, and it is a forward-looking name in that Apple feels like, yes, this is the Apple

⏹️ ▶️ John file system now. 30 years from now, when this is old and busted, they might have a problem, because

⏹️ ▶️ John what do you call the next one? But for now, Apple File System works, and because they can’t use AFS,

⏹️ ▶️ John APFS is the awkward abbreviation for Apple

⏹️ ▶️ Marco file system. Honest question: do you honestly think that when this file system is outdated, we

⏹️ ▶️ Marco will still even have the option to format our own partitions? So, look at iOS. iOS

⏹️ ▶️ Marco doesn’t give us any options. You’re still gonna

⏹️ ▶️ John have to read the APFS-formatted volumes. I don’t know. The question is whether the Mac will still

⏹️ ▶️ John be around then, because you don’t have, you don’t see volumes on iPads and Apple TV or the

⏹️ ▶️ John watch or iPhones, so we’ll see that

⏹️ ▶️ Marco maybe they’re out. I mean, at this point, I think we’re lucky that we even still have Disk Utility. I mean, barely,

⏹️ ▶️ Marco but I think we’re lucky that we have it at all. I’d be surprised if

⏹️ ▶️ Marco we still have the ability to format disks the way we want to and format partitions we want to, you know,

⏹️ ▶️ Marco in even ten years.

⏹️ ▶️ John Yeah. Yeah, well, it could be. If the file system continues to be pushed towards the geeky side of things,

⏹️ ▶️ John then they’re free to do whatever they want, sort of like Grand Central Dispatch is the marketing name, but libdispatch

⏹️ ▶️ John is the library name on disk. It’s just a clear separation between the names they present

⏹️ ▶️ John in the marketing and the actual names that programmers know them by, so it may become less important.

⏹️ ▶️ Marco Finally, when in 30 years when this file system is

⏹️ ▶️ Marco outdated and being replaced by the next file system. Can we get together when you are like,

⏹️ ▶️ Marco what, 72, 71 at that point and do a podcast

⏹️ ▶️ Marco then with a bell in

⏹️ ▶️ John it? By then I plan to be living like a king in Patagonia.

⏹️ ▶️ Marco They will probably have the internet there and microphones will probably still exist.

⏹️ ▶️ John I like throwing in yet another reference that you guys don’t

⏹️ ▶️ Casey get. I knew enough to know it was a reference, but I also knew I did not get it.

⏹️ ▶️ John —Partial

⏹️ ▶️ John, Casey credit.

⏹️ ▶️ Casey I’ll take it. All right.

APFS: Unicode filenames

⏹️ ▶️ Casey Do you want to start running through some of the features? Do you want me to prompt this? How would you like to proceed,

⏹️ ▶️ Casey sir?

⏹️ ▶️ John Well, in this part, I’m going to talk a little bit about file names. But I thought this would be a good opportunity

⏹️ ▶️ John to find out how much you guys know about Unicode.

⏹️ ▶️ Casey Not nearly enough. I’ve read the, oh, God, who’s the, the Spolsky thing

⏹️ ▶️ Casey on Unicode and string handling, and I’ve already forgotten all of it.

⏹️ ▶️ John Marco, do you come across this at all? You probably don’t, because NSString takes care of this for you. And you don’t even care what the internal

⏹️ ▶️ John representation is. You know, what about in PHP land? You deal with any of this stuff when you do web things dealing with

⏹️ ▶️ John Unicode and text and everything?

⏹️ ▶️ Marco I know a lot about Unicode. Yes.

⏹️ ▶️ John All right. Well, so this will be new to Casey and maybe new to some people, but I think we have to go over a little

⏹️ ▶️ John bit of the basics before I explain what the deal is. So we’re talking about this in the context of APFS

⏹️ ▶️ John because files have names and directories have names, right? So when you talk about a file path, which usually doesn’t show up in

⏹️ ▶️ John Mac operating system or anyplace else. That entire thing is presented as a string, and certainly

⏹️ ▶️ John file names are presented in strings, and you would think this is a straightforward thing, like that nobody thinks about, oh, of course, like, files

⏹️ ▶️ John have names, like, that’s fine. But names are strings, and strings are fiendishly complicated.

⏹️ ▶️ John And it actually matters from the perspective of the file system, because it has to do a lot of stuff with strings.

⏹️ ▶️ John So the I guess the easiest way for people to understand this

⏹️ ▶️ John is that Unicode is a standard for defining all

⏹️ ▶️ John the different things that go into making a string and things in Unicode

⏹️ ▶️ John are identified by code points that have numbers and the numbers go up really high.

⏹️ ▶️ John Alright, they start low and go up really really high because there’s lots and lots of things that you can put

⏹️ ▶️ John into strings and languages in the United States. And the other thing to understand

⏹️ ▶️ John is that when you write something out to disk or store it in memory or whatever we tend to break things up into bytes

⏹️ ▶️ John and we had a lettering system in America and the sort of Western world

⏹️ ▶️ John called ASCII that defined a very small number of numbers that corresponded

⏹️ ▶️ John to things that can be in a string and each one of them fit into a single

⏹️ ▶️ John byte so all you know A through Z 1 through you know 0 through 9 all the punctuation

⏹️ ▶️ John characters or whatever they all fit into one byte so 255 possible combinations

⏹️ ▶️ John and everybody was happy until we realized that there were other characters and other languages that didn’t fit this,

⏹️ ▶️ John right? So when they came up with Unicode, they tried to be nice and give ASCII all the

⏹️ ▶️ John same numbers that it always had for compatibility reasons, but they just keep going from there. And at a certain point

⏹️ ▶️ John you get up to numbers, Unicode code points, that are way higher than 255.

⏹️ ▶️ John And when it comes time for you to write them to disk, what do you do with those? You can’t just write them out as the

⏹️ ▶️ John big numbers in a series of bytes because the first byte of the number might look like a capital letter P or something.

⏹️ ▶️ John All right, because it’s the same, you know what I mean? So they have come up with a series of encoding systems where

⏹️ ▶️ John when you got this big number that does not fit into a single byte, and you can’t write the way you would write that number out

⏹️ ▶️ John in a sequence of bytes, we need some way to encode these things. One of the ways to encode it is

⏹️ ▶️ John instead of writing one code point every byte, right, one code point every 32 bits,

⏹️ ▶️ John you have a huge space for each one. So your capital letter A, instead of taking up a single byte, takes up,

⏹️ ▶️ John what is 32? Four bytes?

⏹️ ▶️ John, Marco Eight bytes?

⏹️ ▶️ John Yeah, eight bytes.

⏹️ ▶️ John, Marco Four

⏹️ ▶️ John bytes. Yeah, there you go. It’s eight times four. I’ll get it in a second.

⏹️ ▶️ John, Marco Anyway.

⏹️ ▶️ John But that would be incredibly wasteful because now all of a sudden, if you’re writing something in ASCII text, it would

⏹️ ▶️ John take up four times as much room. That’s not good, right? There’s also, we could do it in 16. Can we do

⏹️ ▶️ John it in 16? Well, we actually have more Unicode code points than fit into 16

⏹️ ▶️ John bits, but I think for a while maybe they all fit in. I think NSString uses this

⏹️ ▶️ John encoding called UTF-16 internally.

⏹️ ▶️ Marco Yeah, I believe the very first version of Unicode that they did all fit in. And then

⏹️ ▶️ Marco once we started realizing, oh, there’s other parts of the world and ancient scripts and different combining things and there’s

⏹️ ▶️ Marco all sorts of new stuff now. Don’t forget Emoji. Yeah, that all came later.

⏹️ ▶️ Casey But it’s important.

⏹️ ▶️ John Yeah, that’s not great either, but it also takes up a lot of room. And

⏹️ ▶️ John the one that’s in most common use these days is called UTF-8, which is a pretty clever encoding scheme

⏹️ ▶️ John where ASCII gets to be exactly the way it normally is. Like so, you know, one byte, one

⏹️ ▶️ John byte for each character, and all the ones that are bigger, they have these unique sequences of multiple bytes.

⏹️ ▶️ John You can have one, you know, two bytes, three bytes, four bytes, I think even up to five bytes. All of these are so that all the

⏹️ ▶️ John leading bytes are not mistaken for plain old ASCII characters

⏹️ ▶️ John or whatever. And that’s the most common encoding we use. And this is relevant

⏹️ ▶️ John because when you create a file name, first of all, people don’t want to use a file system where you can only create file names in

⏹️ ▶️ John ASCII, because that would be annoying for people who speak languages other than English and even annoying for English speakers because you can have

⏹️ ▶️ John curly quotes and stuff. But you need to write this, you need to have this stuff in

⏹️ ▶️ John memory, and you need to write it out to disk. So you need some sort of representation. And that’s exactly what

⏹️ ▶️ John encodings are doing in Unicode: they take these Unicode code points, these big numbers, possibly very large numbers,

⏹️ ▶️ John and write them out in a series of bytes in a way, according to some standard, right?
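
To make the size trade-off between the encodings concrete, here is a small Python sketch using only the built-in `str.encode`. The sample characters and the helper name `encoded_sizes` are chosen for illustration. Note that UTF-8 as standardized today uses between one and four bytes per code point.

```python
# Encoded size, in bytes, of single code points under three Unicode encodings.
samples = {
    "A": "LATIN CAPITAL LETTER A (U+0041)",
    "\u00e9": "LATIN SMALL LETTER E WITH ACUTE (U+00E9)",
    "\u0301": "COMBINING ACUTE ACCENT (U+0301)",
    "\U0001F600": "GRINNING FACE emoji (U+1F600)",
}


def encoded_sizes(ch):
    """Return the byte length of one code point in UTF-8, UTF-16 and UTF-32."""
    # The "-le" codec variants are used so that no byte-order mark is counted.
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


for ch, name in samples.items():
    print(name, encoded_sizes(ch))
```

Plain ASCII stays at one byte in UTF-8 but always costs four in UTF-32, which is the waste described above; the emoji needs four bytes in every one of the three encodings.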

⏹️ ▶️ John The additional complication here in Unicode is that Unicode code points, I’ve been trying to say this and

⏹️ ▶️ John trying to say thingy, in other words, instead of saying character or letter, because it’s way more complicated

⏹️ ▶️ John than that. So a Unicode code point might be like the capital letter A, and that’s something that everyone understands,

⏹️ ▶️ John but a Unicode code point might also be something called combining acute accent,

⏹️ ▶️ John that when this Unicode code point follows another letter, it combines to make what

⏹️ ▶️ John looks like one thing on your screen. So if you did the lowercase letter E, plain old

⏹️ ▶️ John ASCII lowercase letter E, followed by Unicode code point 301, which by the way is a number

⏹️ ▶️ John bigger than 255, called combining acute accent, you would get the e with

⏹️ ▶️ John the little line pointing up and to the right on top of it, the acute accent. So if you’re gonna write the word cafe,

⏹️ ▶️ John C-A-F-E, combining acute accent, you get the word cafe

⏹️ ▶️ John with the e with the little thing over it, right? But you can write that same thing, cafe, C-A-F

⏹️ ▶️ John and then e with a little thing over it, in a different way. You could write C-A-F just like in ASCII, and then you can include

⏹️ ▶️ John Unicode code point E9, that’s in hex, Latin small letter e with acute.

⏹️ ▶️ John In other words, Unicode has two ways for you to write something that looks the same on the page. And,

⏹️ ▶️ John uh, I’m not sure what the original motivation is. That’s probably because you could, uh, use combining characters

⏹️ ▶️ John in more flexible ways than including every possible combination, but sometimes the combinations are convenient to include.

⏹️ ▶️ John This adds yet another wrinkle, which is you can write the same thing in Unicode

⏹️ ▶️ John in multiple different sequences of code points. That’s a problem for something like a file system.
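
The two spellings of "cafe" with an accented e can be inspected directly. This is a minimal Python sketch using the standard `unicodedata` module; the variable names and the `describe` helper are illustrative only.

```python
import unicodedata

# Two ways to spell what renders as the same word.
composed = "caf\u00e9"      # c, a, f, U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # c, a, f, e, U+0301 COMBINING ACUTE ACCENT


def describe(text):
    """List each code point in the string with its number and Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(describe(composed))
print(describe(decomposed))

# The strings are different sequences of code points, so they compare unequal
# and encode to different bytes, even though they look identical on screen.
print(composed == decomposed)
print(composed.encode("utf-8"), decomposed.encode("utf-8"))
```

The composed string has four code points and the decomposed one has five, which is exactly the difference a byte-for-byte comparison would see.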

⏹️ ▶️ John It’s not a problem for like, if you’re, if you’re writing a report or printing a page or even making a web page. But it’s a problem for file systems

⏹️ ▶️ John because file systems need to write file names out to disk, like to store them in the metadata somewhere.

⏹️ ▶️ John And in general, file systems don’t want you to be able to have a file with the quote unquote same name

⏹️ ▶️ John in the same place, ignoring file name extensions, which just make this a mess. But if you wanted to make a file called

⏹️ ▶️ John cafe, and then you want to make another file called cafe, you wouldn’t want them both to be existing in the same

⏹️ ▶️ John folder staring you in the face and like how can these files both be here, they both have the same exact name. But if you were to merely

⏹️ ▶️ John write out the encoded bytes for this thing, whether it’s UTF-32, UTF-8, UTF-16, whatever

⏹️ ▶️ John encoding format the file system chooses, if you were to just write them out like that,

⏹️ ▶️ John one application could make CAF Latin small letter E with acute, and one application can make CAFE

⏹️ ▶️ John combining acute accent. And as far as the file system is concerned, the sequence of bytes of these files is different. So when it diffs them, it says,

⏹️ ▶️ John nope, there’s no file with this name, I’ll just make this file. But when you open the folder, you’d see two files called cafe,

⏹️ ▶️ John And that is bad. And I haven’t even gotten into case sensitivity, but as you can see, in a case sensitivity,

⏹️ ▶️ John case insensitive situation, the file system has the same problem. Is there already a file with this name,

⏹️ ▶️ John yes or no? And that’s where you would factor in, okay, well, is there a file with this name, but with the

⏹️ ▶️ John capitalized letters lowercase, you know, ignoring case? But ignoring that entirely, just for plain old case

⏹️ ▶️ John sensitive whatever, there’s multiple ways you can write the same word, and Unicode handles that by

⏹️ ▶️ John a system called, what is it called, normalization? Different normalized forms of Unicode. It has a bunch

⏹️ ▶️ John of different forms. We’ll link to them in the show notes. What it basically comes down to is

⏹️ ▶️ John should we try to break apart every character into its smallest possible pieces, decompose them,

⏹️ ▶️ John you want to have the lowercase e and then the combining accent, or should we try to compose them all

⏹️ ▶️ John into their canonical form by squishing the e with the combining accent into the Latin small letter e with acute,

⏹️ ▶️ John or should we do different combinations of them in different orders, a whole bunch of different normalized forms. The file system,

⏹️ ▶️ John if it wants to… well, I was gonna say the file system has to pick one of these, but it doesn’t. It could

⏹️ ▶️ John just, you know, do what I said and say, I will accept whatever bytes you give me and I have no idea what

⏹️ ▶️ John they mean, and I will just put them in the file system. Or it could pick one of these normalized forms. You pick a normalized form,

⏹️ ▶️ John then it doesn’t matter what the application gives you, it will always be canonicalized when it goes to the file

⏹️ ▶️ John system. So this is what HFS Plus does, but of course HFS Plus, being a very old file system, does not use

⏹️ ▶️ John any of the Unicode normalized forms. It uses a variant of those normalized forms that has changed in different versions

⏹️ ▶️ John of the operating system. If you look at tech note 1150 from Apple, which apparently is

⏹️ ▶️ John no longer online, so we’ll link to like another version of it. It uses kind of

⏹️ ▶️ John decomposed form except for a whole bunch of ranges that it changes to be compatible with like the Mac Greek encoding

⏹️ ▶️ John and it excludes a bunch of things. It’s really, really confusing. But anyway, in

⏹️ ▶️ John HFS plus, if you try to make two files called cafe, and I wrote, of course, a little Perl script to test this out.

⏹️ ▶️ John And you try those two different ways to do it, the E with a combining accent and then the E with a little hat

⏹️ ▶️ John already combined. Doesn’t matter what you write. The HFS Plus will be like, oh, yep, I totally made that file

⏹️ ▶️ John for you. But when you read that file back in and look at the file name, it will have changed

⏹️ ▶️ John it to the normalized form that HFS Plus wants. So you will not make two separate files. If

⏹️ ▶️ John you try to make one file with that one name and one file with the other name, you will just overwrite the same file twice.
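
The normalize-on-write behavior described here can be sketched with a toy in-memory directory in Python. This is an assumption-laden illustration: the class `NormalizingDirectory` is hypothetical, and it uses the standard NFD form from `unicodedata.normalize` as a stand-in for the HFS Plus variant, which differs from standard NFD in the ways mentioned above.

```python
import unicodedata


class NormalizingDirectory:
    """Toy in-memory directory that normalizes every name to NFD before use.

    Assumption: standard NFD stands in for the HFS Plus variant.
    """

    def __init__(self):
        self._entries = {}

    def write(self, name, data):
        # Whatever form the caller used, store the name in normalized form.
        self._entries[unicodedata.normalize("NFD", name)] = data

    def read(self, name):
        # Lookups are normalized the same way, so either spelling finds the file.
        return self._entries[unicodedata.normalize("NFD", name)]

    def listing(self):
        return list(self._entries)


directory = NormalizingDirectory()
directory.write("caf\u00e9", "first")    # precomposed e with acute
directory.write("cafe\u0301", "second")  # e followed by combining acute accent

print(len(directory.listing()))                # one entry, written twice
print(directory.listing()[0] == "cafe\u0301")  # name comes back decomposed
```

The caller that wrote the precomposed name gets a different string back from the listing, which is the behavior the Unix-style tools object to.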

⏹️ ▶️ John Like, HFS Plus does not take your file name for what it is.

⏹️ ▶️ John It does something to it first, or rather the driver for HFS Plus does, to be compliant with the file system, right? This is

⏹️ ▶️ John one of the reasons that a lot of Unix nerds, and I think Linus Torvalds had a big rant about it, but I couldn’t find it on the internet,

⏹️ ▶️ John have complained about HFS Plus. I gave you what the file name is. It’s supposed to be this,

⏹️ ▶️ John and you said, no, I totally made that file for you, and then later when I try to read that file and look at its name, it’s not

⏹️ ▶️ John what I told you. And that’s bad for things like Git, or, you know, Linux for that matter, or other things that deal with

⏹️ ▶️ John files. They don’t expect that to happen. The sort of Unix style approach is, here’s a sequence of bytes

⏹️ ▶️ John that makes up this file’s name. Make that file. And then later when I read that file,

⏹️ ▶️ John I’m going to look at its name, and it better be that. And I better be able to look it up with that name, and the name that comes on the

⏹️ ▶️ John disk better be with the bytes that I gave you, and if it doesn’t, your file system is broken. But as I hopefully explained,

⏹️ ▶️ John if you do that, it’s very easy for applications to make names that look exactly the same

⏹️ ▶️ John in the user interface. Cafe and cafe, like, they are indistinguishable. They’re exactly the same, pixel

⏹️ ▶️ John for pixel, because you can’t see the bytes that make up the Unicode string. You shouldn’t care about them. And to users,

⏹️ ▶️ John to be able to make the same file with the same name, with apparently the same name, that seems like a bug. It seems like something that

⏹️ ▶️ John shouldn’t happen. Which cafe file is the one that I want? They both have the same name. I don’t know. So you

⏹️ ▶️ John can’t do that, and that’s why Apple doesn’t do that. APFS now has to make some

⏹️ ▶️ John decisions about this. What should APFS do for file names? Should it do it the HFS Plus way exactly? Probably

⏹️ ▶️ John not, because that’s got years and years of baggage floating around in it from the fact that it existed before Unicode,

⏹️ ▶️ John I think, or even when, or maybe just it came out when Unicode was very young. I think HFS Plus, HFS predates Unicode, but

⏹️ ▶️ John not HFS Plus. But anyway, doing what HFS Plus does would be

⏹️ ▶️ John best for compatibility, but it would also be slightly crazy. I don’t think Apple wants to maintain their

⏹️ ▶️ John weird mapping tables forever with their different, you know, non-standard

⏹️ ▶️ John normalized forms. Should it do what many Linux file systems do and many

⏹️ ▶️ John other file systems do, which is: I take no position on file names, that’s not my concern. You gave me a sequence of

⏹️ ▶️ John bytes and that’s the file name. I will write that out to the file system. Later, when you read it, you will get back that sequence of bytes. I

⏹️ ▶️ John don’t know what the sequence of bytes means. Could mean anything. I have no idea what that stuff is, but anyway, here’s what you said.

⏹️ ▶️ John My understanding is that that’s what APFS does today. So if you run my little Perl script on APFS,

⏹️ ▶️ John you can make two files that apparently are both named cafe but they’re actually different sequences

⏹️ ▶️ John of bytes because one uses the combining acute accent and the other one uses the e with a little accent already composed on it.
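
The "sequence of bytes" policy can be sketched with a toy in-memory directory in Python. This is not how APFS is implemented: the class `BagOfBytesDirectory` and its methods are hypothetical, and the sketch only shows what follows from comparing names as raw bytes.

```python
class BagOfBytesDirectory:
    """Toy in-memory directory whose file names are opaque byte strings."""

    def __init__(self):
        self._entries = {}

    def create(self, name, data):
        # Names are compared byte for byte; no decoding, no normalization.
        if name in self._entries:
            raise FileExistsError(name)
        self._entries[name] = data

    def names(self):
        return list(self._entries)


directory = BagOfBytesDirectory()
directory.create("caf\u00e9".encode("utf-8"), b"precomposed e with acute")
directory.create("cafe\u0301".encode("utf-8"), b"e plus combining accent")

# Both creations succeed, so the listing holds two names that render the same.
for raw_name in directory.names():
    print(raw_name, raw_name.decode("utf-8"))
```

Only an exact repeat of the same bytes is rejected as a duplicate, so the two visually identical names coexist in the same directory.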

⏹️ ▶️ John And by the way, I think it might be, well, anyway, if you were to

⏹️ ▶️ John do this, when I did it in Perl, I had to choose how to encode the file name, and I chose UTF-8.

⏹️ ▶️ John I think you can’t, the API, well, I’m using Perl, so it has different rules about that, but the bottom

⏹️ ▶️ John line is somebody has to pick the byte representation, all right? You can’t give it a sequence of bytes

⏹️ ▶️ John for your string and use Unicode code point E9 or Unicode code point 301, because they don’t fit in a

⏹️ ▶️ John byte. So you have to figure out a way to fit them in a byte, and that’s what Unicode encodings are, and so I picked UTF-8. If I had given

⏹️ ▶️ John it UTF-16, I’m assuming it would have stored UTF-16 and given me back UTF-16, and so on for

⏹️ ▶️ John all the other different encoding things. So as far as I can tell, APFS is a bag of bytes. That means it’s case sensitive,

⏹️ ▶️ John because bag of bytes means you gave me these bytes, here are these bytes. And so if another file comes along and wants to write cafe

⏹️ ▶️ John with a lowercase c instead of a capital C, it’s a different sequence of bytes. File system says, nope, no file with that name,

⏹️ ▶️ John here you go. Um, I think this will, I don’t know,

⏹️ ▶️ John I don’t think this is the official position, because APFS is not done. You can’t even boot from it right now.

⏹️ ▶️ John It’s obviously not finished. This is what I’m going to be looking for in the coming months

⏹️ ▶️ John and in the rest of 2017. What will be the policy in APFS? Will they add case insensitivity?

⏹️ ▶️ John Once you decide to add case insensitivity, you have to make some decisions. At the very least, you have to make decisions

⏹️ ▶️ John about encoding, because you can’t compare case on two things. Like, if someone writes one file name in UTF-16 and one

⏹️ ▶️ John file name in UTF-8, something in the system has to understand

⏹️ ▶️ John that, you can’t do case comparisons with strings encoded in different ways. Or maybe you just

⏹️ ▶️ John straightforwardly compare it as if, I don’t know, it doesn’t make sense. Case comparisons, you have to have an awareness

⏹️ ▶️ John of the encoding, right? So if they ever want to make a case insensitive version of APFS,

⏹️ ▶️ John they need to decide, here’s how file names are encoded. And then beyond that,

⏹️ ▶️ John they have to make a decision about normalization. Do we do normalization at all? If you don’t do normalization, you

⏹️ ▶️ John can get into these weird situations where you have files with apparently the same name. You could handle this all at the framework

⏹️ ▶️ John level, handle it all in Cocoa, handle it all in, you know, whatever higher level frameworks, UIKit,

⏹️ ▶️ John so on and so forth. I don’t know if that’s something that Apple will do. And it’s

⏹️ ▶️ John not, I think this is not a minor decision, because what you decide to do here

⏹️ ▶️ John has implications that ripple down through history, as we saw with the HFS plus one. After you make this decision, it’s not easy

⏹️ ▶️ John to change it. I suppose you could go with case-sensitive now and do case-insensitive

⏹️ ▶️ John later, but kind of like the opposite of what HFS did, going with case-insensitive from the beginning and making a case-sensitive

⏹️ ▶️ John variant. But anyway, this I think is one of, to me, the most important decisions that

⏹️ ▶️ John has not yet been made for APFS. And since it will probably happen before

⏹️ ▶️ John next year’s WWDC, the only way we’ll be able to tell whether it’s made is looking at the APFS

⏹️ ▶️ John documentation online and continuing to run tiny test programs from languages other than Apple’s frameworks

⏹️ ▶️ John to see what the file system accepts. I don’t know what the right position is. I don’t have a particular

⏹️ ▶️ John position here. I just want them to make a decision and for it to be a reasonable

⏹️ ▶️ John one.
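
The point that a case-insensitive comparison needs to know the encoding can be sketched in Python. The helper name `comparison_key` is hypothetical, and using NFC followed by `str.casefold` is a simplification of Unicode’s full caseless matching rules, chosen here only to keep the example short.

```python
import unicodedata


def comparison_key(raw, encoding):
    """Decode raw file-name bytes, then normalize and case-fold the text.

    Assumption: NFC plus casefold() is a simplified stand-in for full
    Unicode caseless matching.
    """
    text = raw.decode(encoding)
    return unicodedata.normalize("NFC", text).casefold()


# The same visible name, differing in case, normalization form and encoding.
upper_utf16 = "Caf\u00e9".encode("utf-16-le")
lower_utf8 = "cafe\u0301".encode("utf-8")

# Compared as raw bytes, the two names have nothing in common.
print(upper_utf16 == lower_utf8)

# Once each is decoded with its own encoding, they can be matched.
print(comparison_key(upper_utf16, "utf-16-le") == comparison_key(lower_utf8, "utf-8"))
```

The comparison only works because the caller supplies the encoding for each name; bytes alone do not say how to decode them, which is why a case-insensitive file system has to fix an encoding first.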

⏹️ ▶️ Casey And that’s what I was about to ask is that you don’t necessarily have a preference

⏹️ ▶️ Casey between normalized and not normalized. You just want to see that there is a

⏹️ ▶️ Casey declared statement as to how it’s going to work.

⏹️ ▶️ John Yeah, and like a philosophical, some sort of philosophical statement of support, because in many

⏹️ ▶️ John respects, making the file system itself just be bag of bytes, which is you gave me bytes, I stored

⏹️ ▶️ John them, is very straightforward. It’s the easiest to test, it is the simplest,

⏹️ ▶️ John it’s the fastest, you can do really fast comparisons with it. It’s everything you

⏹️ ▶️ John want from a performance perspective, if you were like, say, on the file system team and you wanted to make a really fast file system.

⏹️ ▶️ John But from a user’s perspective, it doesn’t solve user problems that

⏹️ ▶️ John HFS Plus does. I don’t know if you can solve it all at the framework level.

⏹️ ▶️ John I have to think that you can’t really, at the framework level entirely, because

⏹️ ▶️ John I don’t, like, there’s so many different ways and so many different programs that can write files to disk,

⏹️ ▶️ John and not all of them go through your frameworks, right? And it’s madness

⏹️ ▶️ John to have some file names in UTF-16, some file names in UTF-8, some normalized,

⏹️ ▶️ John some denormalized, some a weird mix. That seems untenable to me from just the basic perspective

⏹️ ▶️ John of how do I correctly check whether a file exists? The file name blah, how do I

⏹️ ▶️ John check whether that exists? It’s like well how many different ways can you write this? You know the only

⏹️ ▶️ John way I can check for this, because the file system knows nothing about encoding, is I say, hey, file system, do you have a file named cafe?

⏹️ ▶️ John And the file system is like all right well what sequence of bytes do you want me to check for? Nope I don’t have one of that sequence of bytes.

⏹️ ▶️ John Hmm, you think, is there another way that I could write cafe? I could write it in UTF-32 with a combining

⏹️ ▶️ John accent? Do you have this sequence of bytes? And the file system would say, nope, don’t have that sequence of bytes. And the application’s like, hmm, maybe

⏹️ ▶️ John they did it in UTF-16, but with the Latin small letter E with acute. And like, that’s untenable, right?
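
The guessing game described here can be made concrete with a short Python sketch that enumerates the byte sequences an application would have to try. The function name `candidate_byte_names` and the particular choice of two normalization forms and three encodings are assumptions for illustration; a real search space could be larger.

```python
import itertools
import unicodedata


def candidate_byte_names(name,
                         forms=("NFC", "NFD"),
                         encodings=("utf-8", "utf-16-le", "utf-32-le")):
    """Return every distinct byte sequence the name could be stored as,
    given the normalization forms and encodings being considered."""
    candidates = set()
    for form, encoding in itertools.product(forms, encodings):
        candidates.add(unicodedata.normalize(form, name).encode(encoding))
    return candidates


candidates = candidate_byte_names("caf\u00e9")
print(len(candidates))
for raw in sorted(candidates, key=len):
    print(len(raw), raw)
```

Even with only two forms and three encodings, a four-letter name with one accent already has several distinct byte spellings to probe, and that is before case differences or mixed normalization are considered.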

⏹️ ▶️ John And so as scary as it is for the file system to make decisions about this, because they’re just, they’re very difficult

⏹️ ▶️ John to change after the fact, if it doesn’t, then every file system API

⏹️ ▶️ John needs to either be incredibly thorough or make it so that you can potentially create

⏹️ ▶️ John files with apparently the same name from the user’s perspective. And I think that’s not healthy either. So I don’t know.
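
To make the "bag of bytes" problem concrete, here is a minimal Python sketch. It assumes nothing about how APFS or HFS Plus actually store names; `name_bytes` and `same_name` are hypothetical helpers that only illustrate why one visible name can map to several byte sequences, and how normalizing before comparing hides the difference.

```python
import unicodedata

# Two spellings of "café" that look identical on screen.
precomposed = "caf\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT


def name_bytes(name: str, encoding: str) -> bytes:
    """Return the raw bytes a bag-of-bytes file system would compare."""
    return name.encode(encoding)


def same_name(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC (one possible policy)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(name_bytes(precomposed, "utf-8"))     # b'caf\xc3\xa9'
print(name_bytes(decomposed, "utf-8"))      # b'cafe\xcc\x81'
print(precomposed == decomposed)            # False
print(same_name(precomposed, decomposed))   # True
```

A lookup by raw bytes treats these as two different files. A lookup that normalizes first treats them as one, which is the trade-off being discussed.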

⏹️ ▶️ John It could be that we don’t care about any APIs except for the Cocoa ones, because we don’t care what people going straight through the C APIs

⏹️ ▶️ John or writing Python scripts or whatever are doing. All we care about is the things that go through AppKit

⏹️ ▶️ John and UIKit. And then you could have it be bag of bytes. I don’t know. But

⏹️ ▶️ John I can hear arguments on either side of it. But the fact that nothing has been said so far and the fact that it just does bag

⏹️ ▶️ John of bytes now doesn’t really tell me anything about their position. It just tells me they didn’t get to

⏹️ ▶️ John that part yet. Because I think if they had gotten to it, it would have been in that presentation, like the presentation sort of told developers

⏹️ ▶️ John to test your application on APFS. They did tell them to test it based

⏹️ ▶️ John on the case sensitivity part. And that makes some sense because Apple has shipped a case-sensitive file system for a long time.

⏹️ ▶️ John And a lot of applications break on case-sensitive file systems, like Adobe’s applications and Microsoft’s

⏹️ ▶️ John applications, all apps that were created in the days before the Mac could even have a case sensitive file system.

⏹️ ▶️ John Many of them, believe it or not, attempt to open files or libraries and stuff by file paths that do not

⏹️ ▶️ John match the case of the actual files on disk. You would think that those are the type of bugs that would have been fixed long ago. But

⏹️ ▶️ John the bottom line is Mac users don’t run case sensitive file systems for the most part. iOS luckily has always had case

⏹️ ▶️ John sensitive file systems. But anyway, case sensitive is separate from whether things are normalized.

⏹️ ▶️ John So I think if Apple was committed to bag of bytes file naming,

⏹️ ▶️ John they would have maybe mentioned that in the file system session, but I’m not sure. But anyway, it’s a thing

⏹️ ▶️ John I think about a lot and I have concerns.

⏹️ ▶️ Marco Our final sponsor this week is Backblaze.

⏹️ ▶️ Marco Backblaze is unlimited native online backup for Mac and PC. Go to backblaze.com

⏹️ ▶️ Marco slash ATP for a no credit card required, no risk, free 15 day trial.

⏹️ ▶️ Marco Now Backblaze is the best online backup. That isn’t in the script. I don’t know if they can say that, but I can

⏹️ ▶️ Marco say that because I’ve used many online backup programs and services. And Backblaze

⏹️ ▶️ Marco is by far the best one that’s worked out for me and my favorite one. And I’ve been using them since long

⏹️ ▶️ Marco before they were a sponsor. I have them backing up my computer, Tiff’s computer, even my mom’s computer.

⏹️ ▶️ Marco I have them backing up something like 6 terabytes of my data. It is so solid.

⏹️ ▶️ Marco It has literally never caused us a problem for the, I don’t know, 4 or 5 years I’ve been using them.

⏹️ ▶️ Marco I highly recommend having online backup, especially if you’re going to be playing around with the file system on your computer.

⏹️ ▶️ Marco You should really have online backup to save your butt if you do anything stupid or

⏹️ ▶️ Marco if your local backups are all corrupted by bit rot from HFS Plus. So check out Backblaze

⏹️ ▶️ Marco today at backblaze.com slash ATP for that risk-free 15-day free

⏹️ ▶️ Marco trial. Backblaze has a bunch of other cool features too. So for instance, you can of course restore your

⏹️ ▶️ Marco files on the web, you can download files as a big zip file, you can even have them overnight you

⏹️ ▶️ Marco a hard drive with all your data if it’s like a ton of data and it would take too long to download. And if you return

⏹️ ▶️ Marco the hard drive back to them within 30 days, you get a refund on the price of the hard drive. So it’s a very effective

⏹️ ▶️ Marco way to restore a ton of data at once. And not only that, on the other side, if you want to just restore

⏹️ ▶️ Marco one file, you can do it in the web interface or their iOS or Android apps.

⏹️ ▶️ Marco So one great use for this is if you’re on your phone, or let’s say you’re traveling and you only have your laptop or something, and

⏹️ ▶️ Marco you want a file from your home computer, you can do this with Backblaze because Backblaze has all your files. It’s

⏹️ ▶️ Marco just automatically there, running, backing up everything. You don’t have to think about it, which by the way is how backup should work.

⏹️ ▶️ Marco If you have to think about it, it’s not really a backup. So with Backblaze, you can log in and just restore one file if you forgot

⏹️ ▶️ Marco a file on a trip or something like that. It is incredibly convenient. I’ve done this probably five or six times now because

⏹️ ▶️ Marco it doesn’t happen often, but when it does happen, I’m so glad I have it. And that’s

⏹️ ▶️ Marco kind of the story of online backup. You won’t usually need to use it, but you’ll be really

⏹️ ▶️ Marco glad you’ve had it on those few times that you do use it. Just five bucks a month

⏹️ ▶️ Marco for unlimited space, unthrottled speeds per computer. So five bucks per month per

⏹️ ▶️ Marco computer, it is a great deal, unlimited space. So check it out today. Go to backblaze.com

⏹️ ▶️ Marco slash ATP and our listeners get a 15 day free trial.

APFS: Unicode filenames, cont’d.

⏹️ ▶️ Marco backblaze.com slash ATP. Thanks a lot.

⏹️ ▶️ John Do you guys have any opinions on, before we move on, case sensitivity? Wow.

⏹️ ▶️ Marco So for the two questions that you’ve basically posed here, I think

⏹️ ▶️ Marco the file system should, you know, on the topic of like, you know, bag of bytes versus at least normalizing,

⏹️ ▶️ Marco you know, normalizing to some degree, whether you normalize for like Unicode normal forms for combining characters

⏹️ ▶️ Marco and things, or at least whether you normalize for character encoding. I think encoding is an

⏹️ ▶️ Marco obvious win. Like, you should absolutely normalize for character encoding, you know, and whatever

⏹️ ▶️ Marco encoding they pick, I’m less concerned about that. You know, if you pick UTF-8,

⏹️ ▶️ Marco cool. You know, whatever encoding they want to pick, that’s fine. But I think it is a clear

⏹️ ▶️ Marco win that the encoding should be normalized for file system access.

⏹️ ▶️ Marco Whether or not, you know, normalizing forms of characters and case

⏹️ ▶️ Marco sensitivity, I mean, that’s kind of two different degrees of the same kind of thing. That

⏹️ ▶️ Marco I am less confident on, but my inclination would be,

⏹️ ▶️ Marco I think the era of case insensitivity is probably over,

⏹️ ▶️ Marco but I would say that if I were designing a system, and having heard no other

⏹️ ▶️ Marco arguments except for your large amount of them.

⏹️ ▶️ Marco I think the Cocoa APIs should be responsible for normalizing or not

⏹️ ▶️ Marco normalizing the name, like case and combining characters wise.

⏹️ ▶️ Marco That the low-level C APIs and the raw interface of the file system

⏹️ ▶️ Marco should not perform that kind of messing with characters.

⏹️ ▶️ John Should the file system throw a fatal error if the encoding is incorrect? Like if you

⏹️ ▶️ John give an invalid UTF-8 sequence, should that just not work? Or should it dutifully write those to disk

⏹️ ▶️ John anyway?

⏹️ ▶️ Marco Yeah, because like, if it doesn’t do that, then you have the same problems before where like, you can write a name,

⏹️ ▶️ Marco and then you go to read that name, and it’s not there. So I can see why that could be a problem.

⏹️ ▶️ Marco So maybe yes, maybe the answer is that if you try to have an invalid, you know, sequence

⏹️ ▶️ Marco in the name that you give it that maybe it does error out. I don’t know.

⏹️ ▶️ John All right, because it’s basically a question of like, so say they pick UTF-8, which is exactly what they’d pick, because that’s what everyone picks, because all

⏹️ ▶️ John the other encodings are stupid. If they pick that,

⏹️ ▶️ John then they could just say, oh, when you write your file names to APFS, you should

⏹️ ▶️ John write them in UTF-8. And then if you don’t do it, like, that’s your own stupid fault, right?

⏹️ ▶️ John In other words, it is a policy, not an implementation enforcement. And so then

⏹️ ▶️ John later when some other program tries to, like, in other words, if you wrote it with an invalid UTF-8 sequence and then later

⏹️ ▶️ John your program tried to read it with that same invalid UTF-8 sequence, it would find it, right? But it would be garbage, and then

⏹️ ▶️ John any other API that’s trying to, like, compare files or whatever would just, you know,

⏹️ ▶️ John fail in weird ways because you have not even written a valid UTF-8 sequence. Like if you tried to say, is there

⏹️ ▶️ John a file called café, it would be like, nope, don’t see it, because you just wrote an invalid sequence. The other one is to have the

⏹️ ▶️ John file system enforce it at the driver level, and in all the various operating

⏹️ ▶️ John systems, if you pass some file name down through the system, down through the BSD layers, at some

⏹️ ▶️ John point we’re gonna look over your entire sequence and confirm, is this a valid UTF-8 sequence? And if it’s not,

⏹️ ▶️ John boom, everything blows up, forget it, nothing works, or whatever. And I don’t know, I think checking

⏹️ ▶️ John that it’s valid UTF-8 shouldn’t take long and should

⏹️ ▶️ John be fairly straightforward because we’re not dealing with large volumes of data. So that’s what I think they would go with. If they make that decision,

⏹️ ▶️ John it will essentially be impossible to write an invalid UTF-8 sequence to a file name,

⏹️ ▶️ John which could also break applications. Like everything, every decision they make aside from make it work exactly like HFS plus is going

⏹️ ▶️ John to break applications and make it work exactly like HFS plus is the one decision that I think is entirely

⏹️ ▶️ John wrong.
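
The "enforce it" option can be sketched in a few lines of Python. This is an illustration only: `create_name` is a hypothetical stand-in for whatever layer would do the rejecting, not a real file system call. The `sequence_length` helper shows the bit-mask classification of a lead byte, which is only the first step of full validation (a complete check also rejects overlong forms and surrogates, which Python's strict decoder handles here).

```python
def sequence_length(lead: int) -> int:
    """Classify a UTF-8 lead byte by bit mask; 0 means it cannot start a sequence."""
    if lead & 0b1000_0000 == 0b0000_0000:
        return 1  # 0xxxxxxx: ASCII
    if lead & 0b1110_0000 == 0b1100_0000:
        return 2  # 110xxxxx
    if lead & 0b1111_0000 == 0b1110_0000:
        return 3  # 1110xxxx
    if lead & 0b1111_1000 == 0b1111_0000:
        return 4  # 11110xxx
    return 0      # continuation byte (10xxxxxx) or invalid


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 in strict mode."""
    try:
        name.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def create_name(name: bytes) -> bytes:
    """Hypothetical enforcing layer: refuse names that are not valid UTF-8."""
    if not is_valid_utf8(name):
        raise ValueError("file name is not valid UTF-8")
    return name


print(is_valid_utf8(b"caf\xc3\xa9"))  # True: "café" encoded as UTF-8
print(is_valid_utf8(b"caf\xe9"))      # False: "café" encoded as Latin-1
```

The second call fails because the byte `0xE9` announces a three-byte sequence that never arrives, which is exactly the kind of name the enforcing design would refuse to write.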

⏹️ ▶️ Marco I mean, heck, in this day and age, it wouldn’t be unreasonable to have like an actual Intel hardware instruction

⏹️ ▶️ Marco to validate a sequence of UTF-8 characters.

⏹️ ▶️ John Yeah, because the rules aren’t that complicated. Like,

⏹️ ▶️ John, Marco you know,

⏹️ ▶️ John I mean, you don’t even have to see whether it corresponds to, like, a defined code point in the Unicode standard. You can’t,

⏹️ ▶️ John you know, that’s more difficult to do, certainly in hardware.

⏹️ ▶️ John, Marco It’s just like bit masking. Yeah. But

⏹️ ▶️ John anyway, I suspect that’s probably what they’ll end up doing. But case sensitivity, I think, is still

⏹️ ▶️ John a weird issue, because I think users of computers, even

⏹️ ▶️ John just on iOS, where you’re not really ever prompted to write a file name, but certainly on the Mac,

⏹️ ▶️ John having two files called like, I don’t know, like, you know, xmas, and then you

⏹️ ▶️ John write a file called Xmas, but you decide to capitalize the X and you end up with two files, when you thought you were overwriting

⏹️ ▶️ John the previous one. I don’t know, case insensitivity, I think, is more like how regular people

⏹️ ▶️ John think computers work. Case sensitivity is how programmers think computers work, for sure. So I’m sure

⏹️ ▶️ John that it will be very popular with programmers. But I mean, maybe not, because like I said, the fact

⏹️ ▶️ John that there are still Mac applications that only work on case-insensitive file systems due to eminently

⏹️ ▶️ John fixable errors in their source code, it’s like that’s never

⏹️ ▶️ John going to change unless you force it to change. So I don’t know. Casey, you have opinions?
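
As a rough illustration of the "two files that look like one" worry, here is a Python sketch of a directory listing checked for names that would collide under a case-insensitive, normalizing lookup. `lookup_key` is a simplified assumption (NFC followed by `str.casefold`); it is not the folding algorithm HFS Plus or any Apple framework actually uses.

```python
import unicodedata


def lookup_key(name: str) -> str:
    """Simplified comparison key: NFC-normalize, then case-fold."""
    return unicodedata.normalize("NFC", name).casefold()


def find_collisions(names):
    """Group names that would refer to the same file under lookup_key."""
    groups = {}
    for name in names:
        groups.setdefault(lookup_key(name), []).append(name)
    return [group for group in groups.values() if len(group) > 1]


listing = ["xmas.txt", "Xmas.txt", "notes.txt", "caf\u00e9.txt", "cafe\u0301.txt"]
for group in find_collisions(listing):
    print(group)
```

On a case-sensitive, bag-of-bytes volume all five names coexist. Under this key the first two collapse into one entry, and so do the last two.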

⏹️ ▶️ Casey Yeah, you know, I think that of all the places you want to have the Wild West, the

⏹️ ▶️ Casey file system is not it. And I think that forcing an encoding makes sense.

⏹️ ▶️ Casey I agree with you mostly that

⏹️ ▶️ Casey users expect case insensitivity, but I don’t know. I think

⏹️ ▶️ Casey what with the internet running on Unix for the most part and URLs generally being case sensitive,

⏹️ ▶️ Casey I think that people are starting to become more comfortable with that.

⏹️ ▶️ Casey So if I had a vote, I would say enforce something like UTF-8.

⏹️ ▶️ Casey If you write something that’s garbage or not UTF-8 encoded, barf all

⏹️ ▶️ Casey over it and do so violently. And I would favor case sensitivity

⏹️ ▶️ Casey by default. That’s my vote.

⏹️ ▶️ John I don’t think anyone’s in favor of normalization, but like I said, you end up in weird situations. So I would

⏹️ ▶️ John suspect if they don’t touch normalization at the file system level, then they do it all at the API level. But

⏹️ ▶️ John then it’s just, it’s pretty easy to make fairly absurd situations where I bet you could

⏹️ ▶️ John pick some word or phrase that has a huge number of possible combinations, like

⏹️ ▶️ John in some language that uses accents on its characters. And just, you know, the number of possible

⏹️ ▶️ John permutations goes up pretty quickly as you add the accented characters. Then you can make a folder full of literally hundreds

⏹️ ▶️ John of files that all appear to have exactly the same name in the Finder. That’s something you can’t do in HFS Plus.

⏹️ ▶️ John And it’s something that you shouldn’t be able to do, but will be able to do with the

⏹️ ▶️ John file system, even if they pick an encoding, but they don’t pick a normalization. So

⏹️ ▶️ John that’s what I’m thinking about. It’s not a straightforward problem. It’s not an obvious answer, though. Clearly, Apple will do this. There is

⏹️ ▶️ John no easy answer.
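
The growth John describes can be counted directly. The following Python sketch (the function name `spelling_variants` is made up for illustration) builds every mix of precomposed and decomposed forms for a name. Each accented character that has a canonical decomposition doubles the count, so a name with n such characters has at least 2**n byte-distinct spellings.

```python
import itertools
import unicodedata


def spelling_variants(name: str) -> list:
    """Every mix of precomposed and decomposed forms, character by character."""
    choices = []
    for ch in unicodedata.normalize("NFC", name):
        decomposed = unicodedata.normalize("NFD", ch)
        # A character with a canonical decomposition can be spelled two ways.
        choices.append([ch] if decomposed == ch else [ch, decomposed])
    return ["".join(parts) for parts in itertools.product(*choices)]


variants = spelling_variants("r\u00e9sum\u00e9")  # "résumé"
print(len(variants))                                          # 4
print(len({v.encode("utf-8") for v in variants}))             # 4 distinct byte strings
print(len({unicodedata.normalize("NFC", v) for v in variants}))  # 1 after normalizing
```

Two accented letters give four files with the same apparent name. Eight accented letters would give 256, which is how a folder ends up with "literally hundreds" of look-alikes.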

⏹️ ▶️ Marco I never thought we’d talk this long about the detailed implementations of dealing with

⏹️ ▶️ Marco file names in a file

⏹️ ▶️ Marco, John system. Well,

⏹️ ▶️ John it’s more just about Unicode. And even though, right, you know, you and I both know it having dealt with it. But I can tell you the

⏹️ ▶️ John first five times I dealt with Unicode, I still didn’t understand it. Only on the sixth time. You know, like,

⏹️ ▶️ John if you have never heard anything about Unicode, hearing what I just said about all the code points and normalization is

⏹️ ▶️ John not going to sink in. But it’s weird. And if you don’t know about it, and think about

⏹️ ▶️ John it, when dealing with strings, you will eventually be sad.

⏹️ ▶️ Marco No, I mean, dealing with Unicode, in my experience is a lot like parenting, in that

⏹️ ▶️ Marco the more experience you get in the area, the more you realize that you will just

⏹️ ▶️ Marco always be wrong. And there are no good answers.

⏹️ ▶️ John Wow. Well, I mean, one thing is just understanding the standards, which I think is difficult enough. And then once you understand

⏹️ ▶️ John the standards, then you realize, no matter what I pick, there are downsides. And then the real

⏹️ ▶️ John sad realization is that you realize all the places, if you’re a programmer, you realize all the places in

⏹️ ▶️ John your program where text is coming in without the encoding specified. And you’re like, I don’t even know what the

⏹️ ▶️ John right thing to do is. How is this going to be encoded? This is just bytes coming across some sort of,

⏹️ ▶️ John you know, reading out of a disk, a file on disk or coming across the network. And then you end up

⏹️ ▶️ John like rippling outwards and asking all of your input sources, “Hey, can you tell me what encoding you’re using?” and they say,

⏹️ ▶️ John “Encoding? What?” And then you realize you basically have an intractable problem where, especially if you’re accepting

⏹️ ▶️ John input from the wider world, that it doesn’t come tagged along with what encoding it is,

⏹️ ▶️ John and you don’t know how to make sense of it unless you know the encoding. So you kind of have to, I can detect the encoding through a bunch of heuristics,

⏹️ ▶️ John or I can guess, or I can try to enforce downstream that you should only give it to me in this encoding, and then you find out how many people

⏹️ ▶️ John are actually sending you things that don’t conform, or all of a sudden you get a bunch of UTF-16, or

⏹️ ▶️ John Shift JIS, and you think, this is terrible. Text is terrible. We should communicate through a series of ones and

⏹️ ▶️ John zeros. It’d be much easier.
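
Here is a small Python sketch of the "guess the encoding" situation. `decode_with_candidates` is a hypothetical helper that tries a fixed, assumed list of encodings in order. Real detectors use statistical heuristics, and the last line shows why any such guess can be silently wrong.

```python
def decode_with_candidates(data: bytes, candidates=("utf-8", "shift_jis", "utf-16")):
    """Try each candidate in order; return (encoding, text) for the first that decodes."""
    for encoding in candidates:
        try:
            return encoding, data.decode(encoding)
        except UnicodeDecodeError:
            continue
    return None, None


text = "日本語"
as_sjis = text.encode("shift_jis")
as_utf8 = text.encode("utf-8")

print(decode_with_candidates(as_utf8))   # ('utf-8', '日本語')
print(decode_with_candidates(as_sjis))   # ('shift_jis', '日本語')

# A wrong guess does not always raise: UTF-8 bytes read as Latin-1 decode "successfully".
print(b"caf\xc3\xa9".decode("latin-1"))  # cafÃ©
```

The order of the candidate list is itself a policy decision. Put a permissive encoding like Latin-1 first and everything "decodes", just not into what the sender meant.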

⏹️ ▶️ Marco Wow. I’ve never heard Shift JIS discussed in a podcast, and I

⏹️ ▶️ Marco don’t think I ever will again.

⏹️ ▶️ John Oh

⏹️ ▶️ John, Marco yeah

⏹️ ▶️ John but my first big normalization project was like I would have killed for UTF-8

⏹️ ▶️ John like that’s that’s how bad it was before Unicode things were worse let’s just

⏹️ ▶️ Marco put it that way. I mean, I mostly avoided having to deal with these issues over

⏹️ ▶️ Marco time with my various web services. You’re right that in native apps, you don’t

⏹️ ▶️ Marco generally have to deal too much with this. On the web, you’ve got to deal with it all the time. The

⏹️ ▶️ Marco easiest thing on the web is to just make all your pages serve UTF-8, serve themselves as UTF-8,

⏹️ ▶️ Marco which will then make all browsers default to sending you UTF-8 in forms. Only

⏹️ ▶️ John the modern world will make your browsers do that. In the old world you could do whatever you wanted with your web pages, but

⏹️ ▶️ John people in Japan were gonna be sending you things in a different encoding, and it doesn’t matter if you put UTF-8 as

⏹️ ▶️ John the, you know, like, that was great when browsers finally started to honor the encoding of the page.

⏹️ ▶️ Marco Yeah, but that I mean that was most of my career has been after that. I mean like when we were building Tumblr in 2006 that

⏹️ ▶️ Marco was the case so it’s been okay but and then the only time I really had to think

⏹️ ▶️ Marco about Unicode on web stuff recently is when emoji

⏹️ ▶️ Marco came around. Then I had to alter all my MySQL tables to be the utf8mb4

⏹️ ▶️ Marco character set, which is annoying. But after that, it was fine.
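
The reason emoji forced that change is byte length: MySQL's older `utf8` character set stores at most three bytes per character, and `utf8mb4` allows four. A short Python sketch (helper names are illustrative) shows which characters cross that line.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_three_byte_utf8(text: str) -> bool:
    """True if every character needs three bytes or fewer in UTF-8."""
    return all(utf8_length(ch) <= 3 for ch in text)


print(utf8_length("\u00e9"))       # 2  (é)
print(utf8_length("語"))            # 3
print(utf8_length("\U0001F600"))   # 4  (grinning face emoji)
print(fits_three_byte_utf8("caf\u00e9 \U0001F600"))  # False
```

Everything in the Basic Multilingual Plane fits in three bytes. Emoji live above it and need four.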

⏹️ ▶️ John MySQL made some Unicode mistakes. The best part of the web and encoding is,

⏹️ ▶️ John even coming to the case sensitivities, the old URL that we increasingly don’t see in our address

⏹️ ▶️ John bar. The hostname part is not case sensitive, whatever you think that means. The encoding for URLs

⏹️ ▶️ John is I think it is basically limited to ASCII, but there’s this way you can encode

⏹️ ▶️ John Unicode code points with this special escape sequencing thing.

⏹️ ▶️ John, Marco It’s

⏹️ ▶️ Marco not- In the host name there is. In the rest of the URL, God knows what’s going on there. In

⏹️ ▶️ Marco, John the

⏹️ ▶️ Marco, Casey host name,

⏹️ ▶️ Marco, John there’s a

⏹️ ▶️ Marco special international domain extension thing, but that has a bunch of security problems, actually.

⏹️ ▶️ John Right, of course, and then like you said, for the rest of the URL, it’s kind of the wild west,

⏹️ ▶️ John and it’s the worst that half of the string is case sensitive and half of it isn’t. Again,

⏹️ ▶️ John, Marco whatever the hell you think case-sensitive means,

⏹️ ▶️ John URLs are still a problem. In that respect, the body of your HTML

⏹️ ▶️ John documents is blessedly normalized at this point. Because like you said, most modern browsers honor the encoding.

⏹️ ▶️ John Anyway, text is hard.
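
For reference, the two escaping schemes being half-remembered here can be shown with Python's standard library. Percent-encoding applies to the path and query, and the internationalized domain name (IDNA, "punycode") form applies to the host name. The URL parts below are made-up examples.

```python
from urllib.parse import quote, unquote

# Path: non-ASCII characters become percent-escaped UTF-8 bytes.
print(quote("/menu/caf\u00e9"))    # /menu/caf%C3%A9
print(quote("/menu/cafe\u0301"))   # /menu/cafe%CC%81  (same visible name, different URL)
print(unquote("/menu/caf%C3%A9"))  # /menu/café

# Host name: the IDNA codec produces the ASCII form that goes over the wire.
print("b\u00fccher.example".encode("idna"))  # b'xn--bcher-kva.example'
```

The two path results show that the normalization question from the file system discussion follows the name straight into the URL.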

⏹️ ▶️ Casey All

⏹️ ▶️ Casey, John right, well.

APFS: (Lack of) data integrity

⏹️ ▶️ Casey Speaking of things that are hard, how about data integrity?

⏹️ ▶️ John Yeah, I’ve touched on this last time. This is the one feature I wanted from APFS that it did not deliver.

⏹️ ▶️ John And we’re speculating about why that might be or if it really was the case. Data integrity,

⏹️ ▶️ John just a quick review, is basically the idea that if a program writes

⏹️ ▶️ John some data to disk, later it reads that data back, the data that it originally wrote should

⏹️ ▶️ John be the data that it gets back. that could be a day later, a year later,

⏹️ ▶️ John ten years later. If that’s not the case because hardware is fallible, you would like

⏹️ ▶️ John the file system to know, hey you asked for these bytes but it turns

⏹️ ▶️ John out that since the last time since they were written they have been screwed up. So

⏹️ ▶️ John restore from backup or solve this problem another way. ZFS and some other file systems have a way to make

⏹️ ▶️ John multiple redundant copies of the same data and when they detect this error they can fix it

⏹️ ▶️ John because they have a good copy and a bad copy and they will just ignore the bad copy and use the good copy and write another copy

⏹️ ▶️ John of the good one or whatever. All that stuff is expensive because it means you have to write out checksums

⏹️ ▶️ John with all of your data and you have to sort of do end to end checksum so that everyone is sure yep I’m sending

⏹️ ▶️ John you this and here’s the checksum yep I got that and here’s the checksum I’m writing to disk making sure everything is all good you know

⏹️ ▶️ John that’s what that’s a feature of ZFS it’s a very important and wonderful feature of ZFS but

⏹️ ▶️ John it has a cost associated with it both in terms of computation of computing all those checksums and and also

⏹️ ▶️ John dependency, because you can’t really be done writing the data until you’ve also made sure all the data was written correctly

⏹️ ▶️ John and then also written the checksum and made sure that, you know. So there’s lots of dependencies that are inherent in that.

⏹️ ▶️ John And mostly I wanted it for the case of bit rot, as in, over long periods of time, I write out my

⏹️ ▶️ John family’s photos to disk, and I make backups of that disk, and I push that up to Backblaze, and

⏹️ ▶️ John it’s on Time Machine, and I’m doing SuperDuper clones. And if some of those bits are flipped because of

⏹️ ▶️ John errors in the hardware, all I’m doing is copying those now corrupted bits over and over and over again, over the

⏹️ ▶️ John years to different backups, you know, the incremental backups don’t go on forever, there’s a window of time, you know, before

⏹️ ▶️ John they’ll fit on the disk. And the same thing with Backblaze, I don’t think they keep your backups forever and ever and ever, every

⏹️ ▶️ John different version. So eventually, you will just be propagating the corruption

⏹️ ▶️ John over the years until 20 years later, when you try to look at some baby picture, your favorite baby picture is all messed up, an all-scrambled

⏹️ ▶️ John JPEG, because a bunch of bits flipped and you don’t know which ones. That’s what I’m trying to protect against. And

⏹️ ▶️ John APFS does not provide that feature. It does not have the option to write out checksums with

⏹️ ▶️ John all of its file data. Now APFS is a flexible file system intentionally designed to be extensible

⏹️ ▶️ John without breaking backwards compatibility. This feature could conceivably be added

⏹️ ▶️ John in the future. Hell, it could be added before 2017. They got a whole year, right? But it’s not there right now and that’s a

⏹️ ▶️ John little bit disappointing. It does write checksums for metadata. Metadata is like,

⏹️ ▶️ John here’s what the file’s name is, here’s all the dates, the ownership, the permissions, and here’s where each individual piece of this file

⏹️ ▶️ John data is on the disk so you know where to go get it. I keep saying disk, but you know what I mean.

⏹️ ▶️ John And it writes checksums for metadata. I mean, there’s good reasons to do that, mostly because

⏹️ ▶️ John this file system, like many others, tries to, well, it doesn’t do an always

⏹️ ▶️ John consistent on-disk representation, but it comes close. It wants to know, did I finish

⏹️ ▶️ John this operation? So if someone yanks out the plug to your computer, and it was in the middle of writing a file, what you want to happen

⏹️ ▶️ John when you reboot is either the entire file is there or none of the file is there. And And checksums on metadata

⏹️ ▶️ John could conceivably help with that, because what you’re going to do is you’re going to write the data and write the metadata.

⏹️ ▶️ John But if the metadata is not complete, like if we’re not, we didn’t finish, if we’re not done, then you just ignore all

⏹️ ▶️ John of that. It’s all invalid, and the file basically isn’t there. You have a consistent file system. Don’t worry about that partially written

⏹️ ▶️ John file or that partially written metadata or whatever. And how do you tell whether the metadata is partially written? Well, if all your metadata has

⏹️ ▶️ John checksums, when you’re looking at that piece of metadata, you can look at it and do the checksum and say,

⏹️ ▶️ John does the checksum match the metadata? If it does, it’s complete. And if it doesn’t, you know. You can do that with journaling and stuff. But

⏹️ ▶️ John APFS does not use journaling. Journaling is a relatively expensive way to do the same thing, where you sort of

⏹️ ▶️ John write out what you plan to do, then you do it, then you write out that you’re done. And that’s how it can tell whether you finished an operation or not.
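
The "is this metadata complete?" check can be sketched with a checksum appended to each record. This is a toy layout invented for illustration (a length prefix, the payload, then a CRC32), not APFS's on-disk format or its checksum algorithm. The point is only that a torn or damaged write fails verification and can be ignored.

```python
import struct
import zlib


def pack_record(payload: bytes) -> bytes:
    """Toy record: 4-byte length, payload, 4-byte CRC32 of the payload."""
    return struct.pack("<I", len(payload)) + payload + struct.pack("<I", zlib.crc32(payload))


def record_is_complete(blob: bytes) -> bool:
    """True only if the whole record is present and its checksum matches."""
    if len(blob) < 8:
        return False
    (length,) = struct.unpack_from("<I", blob, 0)
    if len(blob) < 8 + length:
        return False  # the write was cut off before the trailer
    payload = blob[4:4 + length]
    (stored,) = struct.unpack_from("<I", blob, 4 + length)
    return zlib.crc32(payload) == stored


record = pack_record(b"name=photo.jpg;owner=john;mode=0644")
torn = record[:-6]                                        # power pulled mid-write
flipped = record[:10] + bytes([record[10] ^ 0x01]) + record[11:]  # one bit flipped

print(record_is_complete(record))   # True
print(record_is_complete(torn))     # False
print(record_is_complete(flipped))  # False
```

After a crash, a reader that finds an incomplete record simply treats the file as not there, which gives the all-or-nothing behavior described above without a separate journal.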

⏹️ ▶️ John And metadata is in general more important than data. Because if you screw up the metadata, you could hose your

⏹️ ▶️ John entire disk, like screwing up the metadata for an entire directory, and then everything under it becomes invisible or whatever.

⏹️ ▶️ John And metadata is really small, and checksums are really fast on it. So there’s no real performance hit. So APFS

⏹️ ▶️ John does do checksums with metadata, but does not do it with data. This is a topic that came up a lot at WWDC,

⏹️ ▶️ John talking to other people, the other file system nerds who had also

⏹️ ▶️ John talked to people on the file system team about APFS. And I

⏹️ ▶️ John think this was a very popular feature suggestion:

⏹️ ▶️ John hey, it would be great if APFS had data integrity. Or the question, why doesn’t APFS have data integrity? And a lot

⏹️ ▶️ John of the reasons are, you know, things that I said, that it is potentially expensive. One of the

⏹️ ▶️ John reasons that I heard was that SSDs do their own checksumming internally,

⏹️ ▶️ John which is true. They do, because they have to deal with the pretty startlingly

⏹️ ▶️ John unreliable nature of flash storage. You know, all sorts of memory

⏹️ ▶️ John chips are inherently, at the bottom, tiny little analog devices that are a little bit scary

⏹️ ▶️ John when you find out how they work and SSDs are no different. SSDs have over

⏹️ ▶️ John provisioning where they have many more memory cells than they’re going to use and it’s actually a tiny little operating system in there that’s trying

⏹️ ▶️ John to distribute all this stuff to all the storage and they wear out if you use them too much. It’s very very complicated

⏹️ ▶️ John but they do copious, not just checksums, but also error-correcting codes. So if they find a bit flip because

⏹️ ▶️ John of some analog error or whatever, they can fix it before returning the data. So that’s one of their

⏹️ ▶️ John stopgaps. It’s like, hey, spinning disks don’t do that so much, but they’re going out, and SSDs do

⏹️ ▶️ John the checksums. That’s not really a great solution because there are other places where errors can be introduced, like

⏹️ ▶️ John other hardware other than the SSD itself, like the bus and everything. But since Apple controls that entire chain,

⏹️ ▶️ John they control the bus, they control the SSDs in theory, especially since they don’t make Macs anymore that

⏹️ ▶️ John you can swap lots of different hard drives into and everything, they’re a little bit better off than everyone else. But

⏹️ ▶️ John that doesn’t really help with bit rot. If I write all my family photos to a one-terabyte SSD, they get there successfully,

⏹️ ▶️ John and then five years later I have no idea if those bits are flipped. The file system can’t tell me because it doesn’t know what bits were originally written.

⏹️ ▶️ John All it can do is read them and say, here’s what I’ve got on disk. Is that what you wrote five years ago? Beats the hell out of me. I don’t

⏹️ ▶️ John know. So I’ve been thinking about possible other solutions if Apple never builds data

⏹️ ▶️ John integrity into APFS, and one of them is something that, I think there’s an HFS Plus program

⏹️ ▶️ John that does this already: a user space solution. You just scan all your files, checksum

⏹️ ▶️ John them, and write out the checksums someplace else. And then if anything goes wrong with either the

⏹️ ▶️ John checksum or the data itself, the two won’t match anymore and you will know that

⏹️ ▶️ John your file’s corrupted and you can restore it from backup. So it’s sort of a user space manual process

⏹️ ▶️ John of scanning all the files that you care about and writing out the checksums and periodically scanning them all again and making sure the

⏹️ ▶️ John checksums all match. And if any of them don’t match, then restoring from your presumably good backup, you know. So

⏹️ ▶️ John it’s basically protecting against bit rot by actively doing some process periodically, hoping that

⏹️ ▶️ John you don’t propagate whatever corruption has appeared into all of your incremental backups. You will know about the error in

⏹️ ▶️ John enough time to restore from your backups. And that’s kind of a heavyweight solution. I forget the name

⏹️ ▶️ John of the HFS Plus program that does that. Clusters, maybe,

⏹️ ▶️ John or maybe that’s just the compression one. Anyway, I’ll try to look it up for the show notes. Of course,

⏹️ ▶️ John if you’re a programmer and you think about this problem for more than a few seconds, you realize

⏹️ ▶️ John that there is a big race condition here where if you’re trying to read a file and checksum it

⏹️ ▶️ John and then write out that checksum what if between the time that you read that file and the time that you wrote

⏹️ ▶️ John the checksum something modified that file what if it modified it while you were in the middle of reading it

⏹️ ▶️ John your checksum and your file won’t match, and it will appear as if corruption occurred. So you could do a thing where you just

⏹️ ▶️ John say, okay, I read the file, I calculate the checksum, then I write the checksum. After I

⏹️ ▶️ John write the checksum, I’ll read the file again, and if it doesn’t match the checksum, I won’t assume it’s corrupt, I’ll

⏹️ ▶️ John just try to do it again. You’re just racing yourself over and over again until you get a matching set and you’re like, oh, finally I got a matching set.

⏹️ ▶️ John But you’re not solving the race condition that way. It’s essentially

⏹️ ▶️ John an intractable race condition without help from some other part of the system. Without help, for example, a way to say,

⏹️ ▶️ John I want to stop anything else in the system from writing to this file, and in general operating systems

⏹️ ▶️ John don’t provide ways to do it. They provide advisory locking, but it’s very difficult

⏹️ ▶️ John to stop anything else from writing to a particular file in

⏹️ ▶️ John general. Luckily, both HFS Plus and, I assume, APFS

⏹️ ▶️ John include a feature that can help with this. I don’t know why it was added to HFS Plus, but it’s been there for a couple years

⏹️ ▶️ John now, I think, and it is, what is it called? It’s called the Write Generation

⏹️ ▶️ John Counter. You can get this on any file today on your Mac in HFS+ if you call getattrlist.

⏹️ ▶️ John The constant in the getattrlist man page is ATTR_CMN_GEN_COUNT.

⏹️ ▶️ John It is, according to the documentation, a non-zero, monotonically increasing generation count

⏹️ ▶️ John for this file system object. So it’s for all file system objects, not just files. It’s basically a number that goes up anytime

⏹️ ▶️ John anything happens to a file. So all you’ve got to do is store both the checksum and the

⏹️ ▶️ John generation count, and then you will know this checksum is for this generation count of the file. And if anything in

⏹️ ▶️ John the entire system does anything to that file, that number will go up and you’ll know your checksum is invalid. So

⏹️ ▶️ John in the future, when you’re checking for corruption, you won’t be confused and say, oh no, I’ve detected corruption, because you will know

⏹️ ▶️ John that, yeah, the checksum doesn’t match the file data, but that checksum was for generation count number five and the

⏹️ ▶️ John file is currently on generation count number seven, so it’s time for you to recompute a checksum. So it is a system-guaranteed

⏹️ ▶️ John way for you to match up the checksum and the generation count and the content of the file.
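
The bookkeeping John describes here can be sketched in a few lines. This is a minimal illustration in Python, not an existing tool. `gen_count` is a hypothetical callable standing in for a wrapper around getattrlist with ATTR_CMN_GEN_COUNT (no such wrapper is shown), and `store` is just a dict. The only point is that the checksum is kept together with the generation count it was computed for, and a changed count means "recompute", not "corruption".

```python
import hashlib
from typing import Callable, Dict, Tuple

# Hypothetical: a function returning a file's generation count. On a Mac it
# would wrap getattrlist() with ATTR_CMN_GEN_COUNT; that part is assumed.
GenCountFn = Callable[[str], int]
Store = Dict[str, Tuple[int, str]]  # path -> (generation count, hex digest)


def sha256_of(path: str) -> str:
    """Hash the file contents in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def record(path: str, gen_count: GenCountFn, store: Store, max_tries: int = 5) -> bool:
    """Save (generation, digest) only if the count did not move while hashing."""
    for _ in range(max_tries):
        before = gen_count(path)
        digest = sha256_of(path)
        if gen_count(path) == before:
            store[path] = (before, digest)
            return True
    return False  # the file kept changing; try again later


def verify(path: str, gen_count: GenCountFn, store: Store) -> str:
    """Return 'ok', 'stale' (file was legitimately modified), or 'corrupt'."""
    saved_gen, saved_digest = store[path]
    if gen_count(path) != saved_gen:
        return "stale"
    digest = sha256_of(path)
    if gen_count(path) != saved_gen:
        return "stale"  # modified while we were reading it
    return "ok" if digest == saved_digest else "corrupt"
```

A "stale" result is the generation-five-versus-seven case: the checksum simply needs recomputing. Only a mismatch at the same generation count is reported as corruption.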

⏹️ ▶️ John And that is very handy. And it makes me think that I’m gonna pull a Marco here, like I should write my first

⏹️ ▶️ John Mac program, which is a file that just does these checksums and use it because I mean, I’m going to say

⏹️ ▶️ John I could write it in Perl, and I totally could write it in Perl, but no, like an actual Mac program that does this. It’s really

⏹️ ▶️ John straightforward. The only real problem is where you write the checksums. You could write them to extended attributes. That seems kind of antisocial

⏹️ ▶️ John to me. You could write them to a separate database, but it better be able to handle millions of things. But anyway, I’m actually thinking about

⏹️ ▶️ John this: a program, a simple program that you point at a subdirectory, that does a bunch of checksums, periodically

⏹️ ▶️ John scrubs it, and tells you if anything has gone corrupt.

⏹️ ▶️ Marco I think a local SQL database would be a pretty good way to do that.

⏹️ ▶️ John Do you think that because you have never inserted more than a million rows in a SQL database? I have and I can tell you it falls over

⏹️ ▶️ John badly.

⏹️ ▶️ Marco Even like a simple because I would just be like you know, a string key. Well, God knows how you’d represent the

⏹️ ▶️ Marco file name. But however you’d represent the file name, a string key, and a checksum.

⏹️ ▶️ John You don’t have to do the file name. You can do that if you have a unique ID for the file.

⏹️ ▶️ John That’s another thing you can get from HFS+. So you’re basically storing two integers? Yeah, it doesn’t

⏹️ ▶️ John do well. If you try to do a graph of inserting, maybe I never

⏹️ ▶️ John tried it with pure integers. I’ve always had some string in there. But once you get into the millions, the inserts start slowing

⏹️ ▶️ John way, way, way down. I know this from experience.

⏹️ ▶️ Marco Can you provide a custom value to be the row ID?

⏹️ ▶️ John I would get rid of the, uh, the sequence.

⏹️ ▶️ Marco Yeah, get rid of one integer. If you can provide a custom value for the row ID, then each row would only have

⏹️ ▶️ Marco effectively one column.

⏹️ ▶️ John Yeah, I don’t know why it’s slow. Maybe I had indexes on it. But anyway, I did it for data that was like

⏹️ ▶️ John a couple of integers and a couple of strings, and I left the, you know, the row ID, and in

⏹️ ▶️ John the millions it’s a disaster. It becomes unusable. Like, it’s fine in tens of thousands,

⏹️ ▶️ John it’s okay in hundreds of thousands, but once you get into millions... And this was maybe like three years ago. So maybe things

⏹️ ▶️ John have improved. But anyway, there are solutions to this. It’s not an impossibility. So I’m saying this: I’ll just store it in a plist.

⏹️ ▶️ John It’ll be fine.

⏹️ ▶️ Marco For whatever it’s worth, I’m pretty sure there is some kind of optimization where if you I think if you provide

⏹️ ▶️ Marco some kind of like big int column instead of the row ID, I think it’ll use that

⏹️ ▶️ Marco as the row ID or there’s something like that. I have to look at documentation.
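
On Marco's hunch about the row ID: in SQLite, a column declared `INTEGER PRIMARY KEY` becomes an alias for the built-in rowid, so the file ID can serve as the row's key with no separate sequence column. Below is a rough sketch of a checksum table built that way, using Python's standard `sqlite3` module. The table name, column names, and helper functions are made up for illustration, and nothing here says how it would behave at millions of rows.

```python
import sqlite3
from typing import Iterable, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) checksum table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checksums ("
        " file_id INTEGER PRIMARY KEY,"  # alias for SQLite's rowid
        " gen_count INTEGER NOT NULL,"
        " digest BLOB NOT NULL)"
    )
    return conn


def save_many(conn: sqlite3.Connection, rows: Iterable[Tuple[int, int, bytes]]) -> None:
    """Insert or update many (file_id, gen_count, digest) rows in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO checksums (file_id, gen_count, digest)"
            " VALUES (?, ?, ?)",
            rows,
        )


def lookup(conn: sqlite3.Connection, file_id: int):
    """Return (gen_count, digest) for a file ID, or None if it is unknown."""
    return conn.execute(
        "SELECT gen_count, digest FROM checksums WHERE file_id = ?", (file_id,)
    ).fetchone()
```

Batching the inserts into a single transaction, as `save_many` does, avoids committing once per row, which is a common reason bulk inserts feel slow.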

⏹️ ▶️ Marco, John Are you sure?

⏹️ ▶️ John Yeah, there’s a lot of optimizations to SQLite, you know, to make it sort of less safe, but faster.

⏹️ ▶️ John But you know, doing it in extended attributes would be entirely straightforward, but I’m not sure that’s the best solution. Especially if

⏹️ ▶️ John you don’t have, if you have only read permission to the files, and you can’t write extended attributes to them. Anyway, let’s think

⏹️ ▶️ John about it, because extended attributes is totally the way to do it. Time Machine, by the way, does keep checksums

⏹️ ▶️ John with all of its files, and has for many years. I forget where it stores them, probably in extended attributes, but who knows. But anyway,

⏹️ ▶️ John in Time Machine, it owns the entire area that it’s writing to. So it doesn’t have to worry about permissions

⏹️ ▶️ John issues. But anyway, that could be a limitation of the program. Hey, if it can’t checksum a bunch of files, it’ll just

⏹️ ▶️ John tell you when it’s done. Yeah, you pointed me at this directory and it turns out there were, you know, 7,000 files

⏹️ ▶️ John that I could not checksum because I can’t write to them. I don’t know. I’ve been thinking about it a little bit. It is the world’s most boring

⏹️ ▶️ John program, but in the absence of file system support for bit rot detection, we need some

⏹️ ▶️ John solution. And it’s technically possible on HFS Plus too, because like I said, this feature, this write

⏹️ ▶️ John generation counter, is on HFS Plus now. It’s just that I’ve never really thought about doing it, as I always

⏹️ ▶️ John assumed the file system would come and save me. But if that doesn’t turn out to be the case, I take matters into my own hands. I

⏹️ ▶️ Marco mean, if there’s ever an app that would be perfect for today’s

⏹️ ▶️ Marco John Siracusa to make and publicize, I mean, I can’t think of something that would

⏹️ ▶️ Marco be better.

⏹️ ▶️ John Yeah, I mean, the other solution, as many people will tell you, oh, just use FreeNAS to make a ZFS NAS and put all your

⏹️ ▶️ John data on that. And that would be a solution. Nobody wants to do that. Synology supports Btrfs

⏹️ ▶️ John now, which I think has data checksumming, but my Synology doesn’t support it. So maybe

⏹️ ▶️ John when my NAS eventually dies and I replace it, I’ll buy one with ZFS and that’ll solve the problem. But in the meantime,

⏹️ ▶️ John I may experiment with this someday.

⏹️ ▶️ Marco The best thing is, you’ll get Sherlocked in like two years when they add data integrity.

⏹️ ▶️ Marco, John Yeah, exactly,

⏹️ ▶️ John, Marco so

⏹️ ▶️ John the full developer

⏹️ ▶️ John, Marco experience. Yeah, exactly.

⏹️ ▶️ John Just when I finally get it working, they’ll be like, now APFS has data integrity.

⏹️ ▶️ Marco Yeah, get rejected from the Mac App Store. It won’t be sandboxable at all. Yeah.

⏹️ ▶️ Marco Try to sell it, get, you know, there’ll be some like competitor that will sell theirs for less. And then,

⏹️ ▶️ John like I said, I think there’s already a program that does this for HFS plus and presumably that same program will work on

⏹️ ▶️ John APFS. I forget what the name of the program is, but yeah, this is not, this is not a new idea. I think this is already implemented in

⏹️ ▶️ John an application. I just can’t remember the name, which is probably why I don’t have it installed or like, I’m serious about when I think about

⏹️ ▶️ John it, it’s like, why haven’t you done this before? Why aren’t you running the HFS plus one that does this.

⏹️ ▶️ John I was always just thinking this file system would save me. I just thought like, well, I’ll just I’ll stick it out.

⏹️ ▶️ John And there’ll be a new file system here probably next year. And I’ve been thinking that for a lot of years. It finally, finally

⏹️ ▶️ John came: no data integrity.

⏹️ ▶️ Casey All right, what else is going on in Apple File System? Tell me about space

⏹️ ▶️ Casey, John sharing.

⏹️ ▶️ John There’s more. Yeah, we’ve got to save that. But I just wanted to get to data integrity. That’s why I shuffled things around. We’ll save the other

⏹️ ▶️ John details for the rest of the show. These are the main points I wanted to get to that weren’t discussed. We’ll talk about space sharing and atomic clones

⏹️ ▶️ John and snapshots next week, probably.

⏹️ ▶️ Marco Alrighty. Thanks a lot to our three sponsors this week: Fracture, Hover, and Backblaze. And we will see

⏹️ ▶️ Marco you next week.

Ending theme

⏹️ ▶️ Casey Now the show is over, they didn’t even mean to begin, Cause

⏹️ ▶️ Casey it was accidental, oh it was accidental.

⏹️ ▶️ Casey John didn’t do any research, Marco and Casey wouldn’t let him, Cause

⏹️ ▶️ Casey it was accidental, oh it was accidental.

⏹️ ▶️ Casey And you can find the show notes at atp.fm,

⏹️ ▶️ John And if you’re into Twitter, you can follow them at

⏹️ ▶️ John C-A-S-E-Y-L-I-S-S

⏹️ ▶️ Casey That’s Casey Liss, M-A-R-C-O-A-R-M-E-N-T,

⏹️ ▶️ Casey Marco Arment,

⏹️ ▶️ John S-I-R-A-C-U-S-A, Siracusa. It’s

⏹️ ▶️ John accidental,

⏹️ ▶️ Casey they didn’t

⏹️ ▶️ John mean to. Accidental Tech Podcast,

⏹️ ▶️ John so long

Post-show: Email and parenting

⏹️ ▶️ John Why don’t we talk about TiVo? It’s still in the notes, don’t think I forgot about that. We’re gonna get

⏹️ ▶️ John to it someday.

⏹️ ▶️ Marco We’re never gonna talk about TiVo. No, we are. We are gonna talk about TiVo. By the time we get to

⏹️ ▶️ Marco it, people are gonna care even less than they do now, and that’s pretty impressive, really, when you think about it.

⏹️ ▶️ Marco I don’t care.

⏹️ ▶️ Casey, John You will be the

⏹️ ▶️ Casey only one.

⏹️ ▶️ John When Apple buys, what the hell is that name? I keep saying Rovio, but isn’t that the Angry Birds company? I

⏹️ ▶️ John, Marco believe

⏹️ ▶️ John so. When Apple buys them, then all of a sudden we’ll care.

⏹️ ▶️ Casey we’ll care. Well then we will talk about it. Until then, don’t see the point.

⏹️ ▶️ Casey But to be fair, I have nothing else to talk about so whatever.

⏹️ ▶️ Marco I discovered during our show that you can now… I don’t know

⏹️ ▶️ Marco when this became possible, but Mail.app on the Mac now supports

⏹️ ▶️ Marco flagging where the flags can be any color of this like list of multiple colors?

⏹️ ▶️ Marco, Casey Yeah

⏹️ ▶️ Casey it’s been that way for a I thought how long have

⏹️ ▶️ Marco I’ve always just hit you know command shift L or whatever it is To set the flag I’ve never

⏹️ ▶️ Marco known there were multiple flags now I can make my crappy email filing system even worse by

⏹️ ▶️ Marco making all like the flag messages that I never get back to now I can make them different colors. I never get back to

⏹️ ▶️ Marco and color your shame yep

⏹️ ▶️ Casey Wow, you know if you never get back to them. Maybe it’s not worth flagging them. Yeah,

⏹️ ▶️ Marco that’s probably right

⏹️ ▶️ John, Marco no

⏹️ ▶️ John That’s not that’s not how that works He’s flagging them because he thinks he has to get back to them and by saying because I failed

⏹️ ▶️ John to they must not have been Important, but that’s not true One of them could be like pick up Adam from

⏹️ ▶️ John daycare. It’s like well. I never did it so I guess that one’s not important Let me unflag it

⏹️ ▶️ Marco Well, the good thing is, when you are as crappy and as slacking as I am with

⏹️ ▶️ Marco dealing with email and responding to email, most email removes

⏹️ ▶️ Marco the need to respond to it if you just don’t do it for like two weeks

⏹️ ▶️ Marco, John Yeah,

⏹️ ▶️ Marco but on the other hand I’m probably losing a lot of friends and colleagues by doing this so it’s definitely

⏹️ ▶️ Marco not free

⏹️ ▶️ Casey Sending email to Marco is sending an email into a black hole. It’s sending it to /dev/null.

⏹️ ▶️ John We just came up with a new topic for a future show. Let me write this in. What, email,

⏹️ ▶️ John email

⏹️ ▶️ Marco handling that? Yeah, that no one’s ever done a podcast on that before

⏹️ ▶️ John I know but it sounds like you guys have very different systems I think it’s worth

⏹️ ▶️ John, Casey Assuming

⏹️ ▶️ John I have a system

⏹️ ▶️ John, Marco exactly

⏹️ ▶️ John That’s right. This is a whole show. I’m writing it in.

⏹️ ▶️ Marco Yeah, my collection of hacks is not really something I would describe as a system It is

⏹️ ▶️ Marco it is more just like it just a gradual progression of shame. I

⏹️ ▶️ Marco, Casey know

⏹️ ▶️ Casey Wait, you’re adding it in to the show.

⏹️ ▶️ John Yeah, how

⏹️ ▶️ John, Casey we

⏹️ ▶️ John handle email: future topic. Screen time for kids is also, I don’t know if it’s still in the document, but that’s also a

⏹️ ▶️ John topic that I think we should get to that my people have threshold of ancient history It doesn’t say engine your kids just get

⏹️ ▶️ John older and they get more screens Declan’s gonna be getting into screen time now. Adam’s already into screen time.

⏹️ ▶️ John It’s a good topic, it’s a whole show.

⏹️ ▶️ John, Casey Yeah.

⏹️ ▶️ Casey Yes, because that’s what we need to do is a parenting show.

⏹️ ▶️ John Yes, parenting, it’s tech related. It’s tech related parenting stuff.

⏹️ ▶️ Marco We should do the parenting show right after doing the email show to really ensure that we get the most emails.

⏹️ ▶️ Marco Oh God.

⏹️ ▶️ John People love these topics.

⏹️ ▶️ John, Marco I know,

⏹️ ▶️ John technology is part of their lives, and part of technology, as Casey will tell you, is

⏹️ ▶️ John your feelings about it.

⏹️ ▶️ Casey Are you still on episode five of Analog?

⏹️ ▶️ John Sure.

⏹️ ▶️ John, Casey I think I listen to like

⏹️ ▶️ John random episodes. Like I’m not going, you know, every once in a while there’s a random episode that people say I have to listen

⏹️ ▶️ John to so I listen to it. But yeah, sequentially, yeah.

⏹️ ▶️ Marco I listen to all

⏹️ ▶️ Marco, John of them

⏹️ ▶️ Marco, Casey because I’m a good friend. Thank you, Marco.

⏹️ ▶️ Casey Exactly. You’re the one who cares. You have a lot of free time. That too.

⏹️ ▶️ Casey Because you’re retired or something. Oh goodness. I actually have a busy

⏹️ ▶️ Marco summer.

⏹️ ▶️ Casey Explain.

⏹️ ▶️ Marco Well you know I’m going on vacations. No I’m just…

⏹️ ▶️ Marco, John I

⏹️ ▶️ Marco mean I’m busy trying to parallelize an mp3 encoder from the early 2000s. Like

⏹️ ▶️ John a sign of not being busy? I think that’s what that is.

⏹️ ▶️ Marco This is also just kind of like a procrastination tactic for me to avoid having to dive into the iOS 10

⏹️ ▶️ Marco stuff quite yet.

⏹️ ▶️ Marco, John We’re

⏹️ ▶️ John gonna say, what are you not doing that’s making you parallelize the MP3 encoder?

⏹️ ▶️ Marco I have like a lot of like big things to do on Overcast for iOS 10 just like just you know the obvious

⏹️ ▶️ Marco stuff like rewriting the watch app for watchOS 3 and then you know looking into all the new widget and

⏹️ ▶️ Marco notification stuff everywhere and it’s just gonna be a lot of a lot of work like

⏹️ ▶️ Marco that and some big big projects that like even just starting them is kind of daunting so I

⏹️ ▶️ Marco I haven’t had a lot of good work time in the last few days so I’ve been kind of just like chipping away at side projects and

⏹️ ▶️ Marco administrative work here and there just in the time I have.

⏹️ ▶️ John You’ve been too busy in the bouncy house.

⏹️ ▶️ Marco Yeah, that’s one of the side projects is bouncing.

⏹️ ▶️ John That’ll really take time out of your schedule. Like, sorry, I would be doing development now, but the house is not going to bounce itself.