Talk:Truncate a file

From Rosetta Code

File size requirement

So just to be sure, if the actual file size is smaller than the given truncated size, it's an error? It doesn't just leave the file alone? --Mwn3d 14:09, 19 July 2011 (UTC)

POSIX truncate() and ftruncate() extend the file if the specified size is larger than the original. It's convenient when you want to reserve some disk space. --Ledrug 14:25, 19 July 2011 (UTC)
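A minimal sketch of the extending behavior described above, using Python's os.truncate(), which on POSIX systems wraps truncate(). The scratch file and the sizes (5 bytes grown to 12) are made up for illustration; on a POSIX system the added bytes should read back as zeros.

```python
import os
import tempfile


def demo_extend():
    """Create a 5-byte file, request 12 bytes, and return the resulting content."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"hello")
        # The requested size is larger than the current size, so the file grows.
        os.truncate(path, 12)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


extended = demo_extend()
print(len(extended), extended)
```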

Copied from below

The requirement that the routine should bail out if the requested size is larger than the original is unrealistic. If it's important that the file must be the requested size, you should extend it; if it's not important, why not just leave it alone and move on? If you really cared, you should have checked its size beforehand anyway. --Ledrug 17:21, 19 July 2011 (UTC)
Yeah, I would have expected a beforehand check here. However, for the purpose of this task, the truncation is the bit that I was hoping to see demonstrated here, rather than determining the file size. I don't have any objection to extending the file, if that is easier than bailing out. It would be nice to get a warning message that the file has been padded, or maybe we could just place a note against the solution that the file gets padded if the provided length is greater than the current length of the file. Markhobley 19:20, 19 July 2011 (UTC)
I'd say something like "For built-in functions, note their behavior when the actual file size is smaller than the desired file size. For user-defined functions, note the choice you made (padding, error message, leaving the file unchanged, etc.)." That leaves enough freedom to make sure there isn't much added to the meat of the solution. --Mwn3d 19:28, 19 July 2011 (UTC)
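To make the choices listed above concrete, here is a rough sketch of a user-defined routine in Python where the behavior for a too-short file is an explicit option. The function name truncate_file and the on_short parameter are hypothetical, invented for this illustration; they are not part of the task description.

```python
import os
import tempfile


def truncate_file(path, size, on_short="error"):
    """Truncate path to size bytes and return the file's final size.

    on_short (a hypothetical option) picks what happens when the file is
    already shorter than size: "error" raises ValueError, "leave" keeps the
    file unchanged, "pad" extends the file to size.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    if on_short not in ("error", "leave", "pad"):
        raise ValueError("unknown on_short policy: %r" % (on_short,))
    current = os.path.getsize(path)
    if current < size:
        if on_short == "error":
            raise ValueError("%s is only %d bytes, shorter than %d"
                             % (path, current, size))
        if on_short == "leave":
            return current
    os.truncate(path, size)
    return size


# Small demonstration on a scratch file holding 10 bytes.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"0123456789")

after_cut = truncate_file(demo_path, 4)                      # ordinary truncation
after_leave = truncate_file(demo_path, 8, on_short="leave")  # file left alone
try:
    truncate_file(demo_path, 8)                              # default policy: error
    raised = False
except ValueError:
    raised = True
after_pad = truncate_file(demo_path, 8, on_short="pad")      # file extended
final_size = os.path.getsize(demo_path)
os.remove(demo_path)
```

Whichever policy a solution picks, a one-line note next to it saying so would cover the suggestion above.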

Assumes unix

It's pretty clear that this task assumes unix file system semantics. So, can we ignore other operating systems in this task? --Rdm 14:33, 19 July 2011 (UTC)

Why is it clear the task assumes unix behavior? --Ledrug 14:37, 19 July 2011 (UTC)
Quoting http://www.conifersystems.com/2008/10/21/windows-vs-unix-file-system-semantics/: "The Windows delete and rename model is different. You wouldn’t know this from the Win32 APIs, but in order to delete or rename a file in Windows, you first have to open it! Once you’ve opened it can you call NtSetInformationFile with InformationClass of FileDispositionInformation or FileRenameInformation. Setting FileDispositionInformation doesn’t even delete the file; it merely enables delete-on-close for the file, and the delete-on-close request could very well be cancelled later."
And, as near as I can tell, you have to get rid of the association between a name and a file before you can give another file that name. --Rdm 14:43, 19 July 2011 (UTC)
I see, you are talking about the rename part. Truncating a file on Windows is not very different from unix. But it's true that, unlike on unix, where a filename is just a link pointing to some inode, Windows files are closely tied to their names. Fair point. --Ledrug 14:55, 19 July 2011 (UTC)
Just an observation...I think the 'rename' bit is to guarantee an atomic replacement in the namespace. NTFS supports transactions which would accomplish this, although it'd be necessary to catch a transaction failure and retry. Other Windows-supported filesystems may or may not support transactions or suitable semantics. (FAT certainly doesn't. I don't know what other filesystems Windows may have native support for.) --Michael Mol 15:43, 19 July 2011 (UTC)
I wrote a DOS extension service that could truncate files by deleting the entries from the file allocation table and updating the directory entry with the new file size once. This was before Microsoft Windows '95 came along and broke everything. The restrictions on semantics are within Microsoft Windows itself, rather than with the filesystem. There may be an interface somewhere that allows truncation, because disk repair tools need to be able to do such things, although I know Microsoft likes to retain a monopoly on such tools, so maybe they forgot to document the interface. Markhobley 17:11, 19 July 2011 (UTC)
No, the interfaces for such things are documented. And you're partially right; there's nothing about FAT that prevents someone with knowledge and the ability to lock the device from going in and atomically making those changes. However, if you want the system to remain online, the operation needs to be performed with the driver. The NTFS driver supports transactions because NTFS itself has features which make implementing transactions not prohibitively difficult in a multiprocess environment. If you want, you can take an entire filesystem offline and access the data directly, but that'd be extraordinarily expensive in terms of disruption to other system services with open file handles. And it'd require greater privs than your average user account will have. --Michael Mol 17:48, 19 July 2011 (UTC)
The task is not limited to Unix. It should be possible to truncate a file on most systems that utilize disks and other storage media type devices. Markhobley 17:11, 19 July 2011 (UTC)
Truncating on Windows is not a problem, just call SetEndOfFile() or truncate() (NT is actually POSIX.1 compliant). It's the renaming part that's inconvenient (still not a real problem). Another thing: the requirement that the routine should bail out if the requested size is larger than the original is unrealistic. If it's important that the file must be the requested size, you should extend it; if it's not important, why not just leave it alone and move on? If you really cared, you should have checked its size beforehand anyway. --Ledrug 17:21, 19 July 2011 (UTC)
Yeah, I would have expected a beforehand check here. However, for the purpose of this task, the truncation is the bit that I was hoping to see demonstrated here, rather than determining the file size. I don't have any objection to extending the file, if that is easier than bailing out. It would be nice to get a warning message that the file has been padded, or maybe we could just place a note against the solution that the file gets padded if the provided length is greater than the current length of the file. Markhobley 19:20, 19 July 2011 (UTC)
I don't know Microsoft Windows. If it supports truncation, then why is a rename required? Does the truncated file become disassociated from the original filename or something? Presumably, it is not just a simple open, truncate, close sequence involved here. Markhobley 19:38, 19 July 2011 (UTC)
Yes, truncation is supported in the Win32 API. At issue, I thought, was the general compatibility of the rename step. I'd probably recommend leaving the temp-and-rename workaround out of it, if it's strictly truncation that's of interest. Or leave off the requirement of atomicity. Or assume that a delete step on the original file is allowed, if necessary. --Michael Mol 20:44, 19 July 2011 (UTC)
I've left the temp and rename workaround in, so that this can be implemented on systems that have no other mechanisms. A delete of the original file prior to rename is permissible. I have added that to the task description. Markhobley 20:56, 19 July 2011 (UTC)
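For reference, the temp-and-rename workaround might look roughly like the following Python sketch. The function name truncate_by_copy and the 64 KiB chunk size are assumptions made for this example. It relies on os.replace() to swap the temporary file in; on a system where a rename cannot overwrite an existing file, the delete-then-rename sequence permitted above would take its place, at the cost of atomicity.

```python
import os
import tempfile


def truncate_by_copy(path, size):
    """Keep only the first size bytes of path.

    The bytes are copied to a temporary file in the same directory, which is
    then renamed over the original. If the original is shorter than size,
    it is copied whole (no padding is attempted).
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            remaining = size
            while remaining > 0:
                chunk = src.read(min(remaining, 65536))
                if not chunk:
                    break  # reached the end of a file shorter than size
                dst.write(chunk)
                remaining -= len(chunk)
        os.replace(tmp_path, path)  # the rename step
    except Exception:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Small demonstration on a scratch file holding 10 bytes.
fd, scratch = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"0123456789")
truncate_by_copy(scratch, 4)
with open(scratch, "rb") as f:
    kept = f.read()
os.remove(scratch)
print(kept)
```

Note that the replacement is a new file, so ownership and permissions of the original are not carried over in this sketch.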
Is NT still POSIX compliant? I thought that functionality was stripped out and shipped as Windows Services for UNIX, which operates under CSRSS as a different subsystem with bizarre interaction limitations. I could be wrong, of course. --Michael Mol 20:46, 19 July 2011 (UTC)
NT4 and 2K were POSIX level 1 compliant. I think starting from XP some versions of Windows have a POSIX layer by default, some don't. I'm not sure if SfU is the same as the POSIX subsystem: I thought SfU was a much more complete compatibility package. At least that's what I heard, but I haven't seriously used Windows for years. --Ledrug 21:04, 19 July 2011 (UTC)

I think this is ready to be promoted to task. Markhobley 17:27, 13 August 2011 (UTC)