Visit the Community Portal Archives
The following discussion has been copy-pasted from User talk:-jkb-/Temporary Scriptorium, where the community discussion took place from March 31 2017 to April 6 2017, when this "traditional" talk page was not available. For the history of the provisional talk page see here. ---jkb- (talk) 17:41, 6 April 2017 (UTC)
- 1 On Quality at Wikilivres
- 3 a suggestion for something new for the main page (EN; though certainly not opposed to expanding it to other langs)
- 4 page-sectioning
- 5 File Size Discussion
- 6 Filetypes and Upload Limits - Discussion
- 7 Sharing cost
- 8 Deletion of old content because it is somewhere else
- 9 Category:Quality notations
- 10 Format Notations - Category ?
- 11 Copyright and Inclusion Policy
- 12 Inclusion Policy on Licenses
- 13 Unique visitors to Wikilivres now 1,000+ Day
- 14 subpages for discussions
- 15 spam accounts -- RESOLVED
- 16 Pages disappeared
- 17 .DJVU format drawbacks and alternatives
- 18 Pauly-Wissowa RE
- 19 Wikilivres.ca domain will DROP - Next Steps
- 20 Dead Man's Switch
- 21 Expense Report for last 60 days
- 22 Purpose and Criteria for inclusion in Biblio.wiki
- 23 Bibliowiki Namespace Impact
- 24 Former admins
- 25 Logo - transitional implemented -
- 26 Threats to Canada's Life+50 Copyright Laws
- 27 Bibliowiki:Donate
- 28 Cannot specify email address
- 29 No magic, a simple explanation?
- 30 Lots of Spam - Change User Creation Process? FIXED
- 31 Maria Montessori Works - Public Domain Status?
- 32 Beatrix Potter to be featured on English Wikisource in December 2017
- 33 Fichiers DjVu manquants
- 34 Taille maximale du fichier
- 35 Admin/Bureaucrat Requirements/ Hosting Expenses -- RESOLVED
- 36 Request to delete the page
- 37 Happy New Year!
- 38 DJVU and PDF file management proposal
On Quality at Wikilivres
There are still many technical issues to be ironed out, however (I hope) there are improvements every few days. The technical goal is a platform that supports the work of the Wikilivres editors, and makes Wikilivres better for all visitors. From a technical quality approach, we have limited resources (storage space, bandwidth). This means the focus should not be on quantity but on quality (and as much of that as possible). ٩(˃̶͈̀௰˂̶͈́)و
Public Domain Quality
Since we are dealing with a 20 year (or less) lifespan as the place where documents can reside in the public domain (until the rest of the world catches up), it may be important to keep in mind that what we bring here should be worth our labor. For some texts, there will be much more attention paid when they reach a PD+70 term. PD-50 is really about making works freely accessible (free from their now-expired copyright, and freely available on this site). There are a few important categories of documents (books, images, etc.) that would be worth the effort of editors here:
- Hard to find or out-of-print
- Otherwise restricted by copyright holder
- Classics (hard to define, but that have importance now or historically)
The best candidates for Wikilivres (IMO) are expensive, hard-to-find, restricted classics. The worst candidates would be cheap, easy-to-find, mediocre works.
Of course, I am only one editor with one opinion, and the work of Wikilivres is as broad and varied as all the collective editors who put in their time and effort, whatever they work on. As a technical resource, my aim is to empower editors, while improving technical quality. As an editor myself, my goal is to improve public domain quality. Thank you for your patience during this technical transition, and I look forward to working together, and working separately on our own projects at Wikilivres. Keep the technical requests coming! --Jeffmcneill (talk) 05:15, 7 April 2017 (UTC)
comment -- do not disagree about the desirability of quality works, but:
we are not the only site "publishing" pd pma-50 content; nor the only one "grinding out" proofed-copy
(setting aside the undesirability of repeating proof-reading work done on the SAME-titles @ other projects; which can be considered as a technical point of operations)
what makes us "special" is the flexibility of being an open wiki project.
we can get stuff up faster, organise it better, & cross-link freely with other projects.
when a reader/user comes here, & looks up an author, they should be able to find a complete bibliography,
with access-links to as many of the author's works as possible.
(ideally, it would also be tightly-integrated with wikipedia>wikisource, wmcommons, etc.)
whether those works are hosted on our site or not (though it is certainly desirable that the hosting-resources should be stable & reliable), is an open question. by "default" i've been going on the assumption that "if it's not on wikisource, it should be here" (with interlinks both ways); but i'm open to revising that approach.
we do have better organisation of material than the gutenbergs, however; & we offer titles from any source, not just the ones that have been proofed here.
tl;dr -- if we're going to bother to do this at all, then we should try to do it better than the other projects.
in particular "better" in terms of real-world, practical usefulness to readers/end-users.
otherwise why are we here?
- I define better by the quality hallmarks mentioned above. Certainly with limited resources we won't be able to do better as in more complete, which I think is a recipe for burnout and low quality. --Jeffmcneill (talk) 02:49, 8 April 2017 (UTC)
- then the question becomes why are we doing this?
- literally; IF/THEN
- project gutenberg canada has a very nice distributed proofreading project, & they can at least turn out a few dozen titles a year.
- why don't we just join them, & save ourselves the trouble of operating this site? especially if the goal is to crawl out maybe 6 books a year. it's not worth it; operating a standalone project just for that.
- I worked on Project Gutenberg around 2010. Maybe it has all changed there now, but at that time I needed permission from the admins to start a new project. Also, I was the only Russian working there at that time. That's why I moved to wikisource. Here everything is like wikisource, therefore it suits me. -- VadimVMog (talk) 05:59, 10 April 2017 (UTC)
- ad "permissions": you can find something similar on the German wikisource - the community has to permit a project; if it does not, you cannot do anything. If you know the situation on some wikisources, it is a good idea to increase the quality, as many users start to insert a huge work by somebody and, after formatting one chapter, they are gone. All you can find from them is awful and worthless junk. -jkb- (talk) 08:51, 10 April 2017 (UTC)
- :) You are right, the problem exists. Sometimes I'd like to do something like this on Russian wikisource -- lots of vandals. Still, I remember my feelings at that time, and I left Project Gutenberg. On the other hand, by trusting every new user we show respect to good people, and they stay. It takes more effort to clean up, that's the price. -- VadimVMog (talk) 17:09, 10 April 2017 (UTC)
- P. S. Returning to the quality of this site. If there is no admin cleaning out all the dirt, life itself will force us to implement some strict policy to keep quality: either not to allow new users at all (like it was on wikilivres.ca) or to control new users (premoderation). As for the quality of our own work, we can only create guidelines. What else can we do? -- VadimVMog (talk) 18:51, 10 April 2017 (UTC)
In my opinion the quality of an edition is important. In my area I try to apply here the standards we have developed on Polish Wikisource and on "my" wiki site "Ogród Petenery" ("The Garden of Petenera", see: wikia:c:wiersze), a wiki with Polish poetry published on free licences that gathers old PD texts, too. Electron ツ ➧☎ 09:50, 10 April 2017 (UTC)
- re @ VadimVMog: I hoped we would need no special guidelines or rules, as both on wikisource and on wikilivres.ca we had no problems with this - nearly everybody used to present good OCRed texts, with headers, chapters, source, no copyright violation, the right licence etc. But it may be that we shall have to discuss this again. Lately there has been quite a large number of texts that do not fit these expectations (@Simon Peter Hughes: has been correcting many of them - many thanks for it). See #how the pages in Wikilivres are to be like as well, but see also the talk deletion appeal, which has been archived in the meantime. I'm pretty sure that some texts that are not formatted in the normal way and to the quality we know from wikisource do not belong in the main namespace. When I come back in some two weeks I shall suggest that (if not deleted) these texts be moved immediately either to a user namespace or to a special namespace, until they fit our quality requirements and expectations. (The same problem is the question of whether we want pdf-files in the main namespace etc.) -jkb- (talk) 21:27, 10 April 2017 (UTC)
a suggestion for something new for the main page (EN; though certainly not opposed to expanding it to other langs)
now that things are almost "back to normal" on the project, something i was thinking of proposing @ the old site, just before everything went blooey:
what about adding a PD comics section to the main page?
updated daily or weekly, say (can promise weekly, cannot guarantee daily by myself)
there are a wide range of titles to choose from; we could plan a rotation, or randomise, or etc.
would be fun, & would add a little variety to our mainpage; which is rather "static" & doesn't get updated a lot.
- A featured works section regarding new works might be nice. Do we have comics? Is that something important to this community? I see a lot of scientific nonfiction and literature here, maybe I'm missing something. --Jeffmcneill (talk) 02:49, 8 April 2017 (UTC)
- i started uploading buck rogers strips sort-of-daily, before the old site went down. only 4 of them made it into the new site.
- there are at least dozens of major comics of the past that are now pd, & for which at least some of the work is find-able. i'm willing to spend a limited amount of time on that; i could promise at least a weekly update to a comic on the mainpage; which is more regular attention than the mainpage has received, ever.
- presently we have a small collection of art-images, mostly from commons i think, that we can host, & they can't. like the kandinsky thing on your talkpage.
tl;dr -- the problem has always been "not having (enough) people to do the work". (actually, we have a new/recent works section on the mainpage; it was last updated in early 2016 xD) Lx 121 (talk) 05:38, 8 April 2017 (UTC)
page-sectioning
our present title-header template seems to be doing something that "suppresses" the little wiki menu-box for page-sections.
i need it to stop doing that.
ultimately, the "ideal form" for our texts seems to be "wikisource-style"; with a book broken down into multiple pages, chapter-by-chapter.
doing this is VERY time consuming (particularly absent auto-tools).
as an INTERIM STAGE in this process, i have been experimenting with "sectioning" the text (i.e.: also "parting out" the book, but on a single page).
it dramatically increases page-navigability, & is not nearly as time-consuming as creating multiple pages manually.
ALSO this makes it easier to divide a book into multiple pages later. (this could even be automated & done by a bot; with human review)
unfortunately, something in our page formatting setup is "suppressing" the menu-box with the page-section links. i assume it is the header.
IF somebody can fix that, that would go a long way to resolving certain disagreements we are having elsewhere, about copy.
- Please feel free to look at the templates and suggest on their talk pages what changes to make. You can experiment in your sandbox to get the right look. It may be something like __NOTOC__ that is doing what you describe.
p.s.: two observations
1. any "page display size restriction", as on certain types of devices, or etc. will always be a temporary, "ephemeral" limit. every "next" generation of software & hardware will "move the bar" higher. 1mb(+) is NOT an unreasonable page-size; IF we are going to worry about this, then at least, we need to keep UP TO DATE on the ever-changing technical standards involved.
- While storage is growing quickly, bandwidth is not so much. There are a large number of projects (e.g., Google AMP) that try to deal with high-latency, low-bandwidth networks (especially for mobile), so the idea that large page sizes will become acceptable (and that 1mb already is) is not borne out. In order to be liked by search engines, and by users, it is important to keep the page size small. I've been watching this for the past 20 years and that ain't gonna change anytime soon (as much as we'd all like it to). Page size is not the place to put the constraint (e.g., certain size of all page elements) but rather the user experience of how long it takes to load enough of a page to begin interacting with the content. Two seconds is a useful goal to strive for. Less than that, even better, but more than that and a closer look at how the pages are being delivered (and size of assets) is needed. --Jeffmcneill (talk) 03:05, 8 April 2017 (UTC)
- comment -- mediawiki is incapable of delivering 2 second page loads now; not on wikipedia, & not here. it's nice to have goals; but we're going to have to change software to deliver that number in "real world" use.
- Wikipedia achieves 1 second load time, and this wiki is currently at 2.9 seconds (using GTMetrix and a Vancouver location). We can get to 2 seconds in North America, certainly. These are the kind of technical things I have done before with other sites. However, the page size can't be big to achieve that. --Jeffmcneill (talk) 16:37, 11 April 2017 (UTC)
- & i understand what you are saying about low-bandwidth, but even on low-bandwidth 1mb is "small change" nowadays. at least in the developed world; except maybe for extremely rural, or wilderness areas. phones use more than that, just "talking" to the network.
- I'm sorry but that is not accurate, and it seems the argument is on the one hand that large page size is ok, and on the other that if we really want to address the issue we have to have very small page sizes. The happy medium is achievable. I can do the server side optimization. Content sizes need to be reasonable. There is no other way. --Jeffmcneill (talk) 16:37, 11 April 2017 (UTC)
2. in its present state, our site is not set up to be "friendly" to users trying to read or download a long text on a mobile device, with a small screen; whatever the page size. IF becoming mobile-device friendly is a priority, then we need to get serious & set up a working group for that. because, right now "we ain't got that".
- I agree about the usability of reading on Wikilivres, and also about mobile-friendliness (two different, but related issues). --Jeffmcneill (talk) 03:05, 8 April 2017 (UTC)
3. uploading a large text to a single page is one stage in "processing"; by all means, DO move it along to the next stage. if or when you have the time & inclination to do so.
BUT if the choice is between a) having a text, & b) NOT having a text, then as far as end-users are concerned, something is better than NOTHING.
&, at the end of the day, end-users MATTER. we are here to provide a service, & resource: free copies of PD content, for people to USE.
- Actually a bunch of poorly formatted text probably has the same functional value to an end-user: not usable. If the idea is that someone feels they can dump poorly formatted text into Wikilivres and it is up to others to do the tedious text formatting, that seems mistaken. --Jeffmcneill (talk) 03:05, 8 April 2017 (UTC)
- 2 points:
- 1. for an end-user, in real-world terms, the difference between "we have a copy of this text" (even a less than perfect one), & "we don't have a copy of this text" is infinity. it is the functional value difference between 1 & 0.
- 2. this is a wiki; hence a collaborative project. where people freely choose to come, & contribute the work they are interested in doing.
- if we are going to create a bunch of passive-aggressively restrictive rules, to "force" people to do things, we are never going to attract a userbase.
- after 14 years of operation, & with all the advantages of wikimedia & wikipedia, ws/en has an active user base of about 100 ppl; & an annual output of work that is a fraction of what gutenberg usa puts out in the same time period.
- if we simply, dumbly imitate wikisource, we are imitating a failing endeavour; and/or "playing club". we don't need this project to do that, or to dumbly imitate gutenberg canada; why bother doing that? especially when we could all just join there.
- i have uploaded about 1-2 works per day, into the english section; which is more content than the english section has gotten in a very long time. many of them are very hard to locate online.
- (i feel it is also worth pointing out that the user who raised the point doesn't even work in the english section)
- my goal is a minimum average of 1 work uploaded per day.
- i have also vastly improved the coverage of the bibliography pages of the authors i've worked on.
- & included links to works elsewhere (ws & the gutenbergs).
- for end-users; for ordinary people who come here looking for stuff, & who don't care about anything else on the wiki but that, the "utility" of these author pages has been increased at least exponentially.
- if you really & truly do not understand the "value" that this adds to the project, then i am sorry for you; & sorry for the project.
(there is also a "legally useful" aspect to our work; in that by PUBLISHING content we are "asserting" public-domain rights, & thereby protecting public-domain rights. use it, or risk losing it)
- One does not need to publish to preserve/protect public domain rights. There is no loss of rights in the public domain (that is actually the default), the loss of rights is to the author's exclusive rights (that are limited by law). --Jeffmcneill (talk) 03:05, 8 April 2017 (UTC)
- "on paper" yes. in practice -- legal rights that are not "exercised" & defended are
- Sorry but this is not factually correct. One must defend exclusive rights, but the default is a lack of exclusive rights (and therefore the public domain). Publishing/distributing public domain works has zero legal impact or value. We need to be clear on this. --Jeffmcneill (talk) 16:37, 11 April 2017 (UTC)
- when the usa started, their copyright was 7 years. then it was 14. then 28. then 56. then 95-120 years for "old" works, & 70 years pma for "new" works.
- in practical terms NOTHING has entered the public domain in the usa since 1998 (via expired copyright). nothing is "due" to enter it, until 2019
- before that, there were already repeated copyright term extensions enacted by congress; starting in the 1970s.
- in practical terms, it is increasingly unclear if anything is going to enter the public domain in the USA, ever again (via expired copyright).
- & canada almost lost pd pma-50 to the TPP.
- one of the several useful purposes that this site serves, is to exercise our legal right to the public domain, by publishing pd content;
- especially recently-pd content.
for a simple analogy, what we are doing is "beating the bounds" (marking out the boundaries) of the public domain in canada, with full & regular "updates" to those bounds.
- This is not legally correct. Let's be accurate here. --Jeffmcneill (talk) 16:37, 11 April 2017 (UTC)
File Size Discussion
- Moved from Bibliowiki:Technical requests
Filetypes and Upload Limits - Discussion
Hi everyone. I can increase the range of acceptable filetypes and also file size. However, we don't want a situation where the disk gets full immediately. First, what filetypes do people need? Certainly: .gif, .jpg, .png, .pdf, .djvu, .epub. Compressed files are more difficult to deal with (where does the decompression happen?), but maybe there is something clever where they are compressed and decompressed on the fly (I'll look into this). Let me know more about .xml filetypes, as I am unsure what is in this format or how it would be used by the wiki or by users. What size files are reasonable, in the opinion of Wikilivres editors? For example, what fits 50%, 80% and 100% of what people have to upload?
While unlimited storage will be nice at some point in the future, there are certain issues (such as backup window expansion, and paying for that storage). Thoughts? --Jeffmcneill (talk) 06:46, 4 April 2017 (UTC)
- comment - would be nice to offer a full range of media file types; though even w/o storage issues the selection is going to be limited (1st disney cartoons go pd in a few yrs though; once ub iwerks reaches pma-50)
- optical page scans use way more data than digitised text, but most books i've been finding come in under 30 megs; many well under. a few of the bigger items, like some of churchill's grand histories, are bigger though. nothing i've seen was @ or over 100 megs; except one a a milne text that was ridiculously large (1.5 gb; i assume because somebody made a bad choice in the settings somewhere. would not even want to try uploading that here; it is unusable. might try compressing it, if i can't find a smaller copy)
- for graphical books though, total file sizes might be bigger; but it's better to have those broken down page by page anyway.
- tl;dr - for optical-scan ebooks 90%+ should fit under 30 megs; maybe 50% of them will be less than 15 mb.
- we could also explore ways to compress items more.
- but for a few special works, we might need to make exceptions (at least until we can digitise the text).
- for digitised text, without "photographs" of the book's pages, file sizes are much less.
- don't know enough about the options, to say anything useful about hosting storage cost considerations right now. except the obvious that free is always good, & also it would be lovely to find generous sponsors/partners for our project xD
- p.s.: to clarify; i need this only for books where the only available copy is a "pure" optical scan (w/o the text digitised). so far, that is a minority of the works, but it includes a number of really "key"/important items. Lx 121 (talk) 07:35, 4 April 2017 (UTC)
- I'm afraid I will slightly oppose: Wikisource (and Wikilivres as well) are primarily for optical text, and we must discuss, sometimes, what we want to publish. I can remember there was a discussion about the great number of PDFs that have never been digitized (Eclecticology demanded digitization; I have the same point of view). And, if we can use the repository of Commons - why pay for it? -jkb- (talk) 07:45, 4 April 2017 (UTC)
- comment -- wikilivres is not just a "copy" of wikisource. it never has been, nor should it be.
- & our collection is more than just texts, & it always has been.
- & the answer to "why not just use commons?" is obvious, & it is exactly the same as the answer to "why not just use wikisource?"
- BECAUSE WIKILIVRES IS PD PMA-50, & commons is NOT
- if you only want to work on bare texts, then i have no objections to this; but in return, i ask that you please do not object to other people who want to work on other things.
- Live, & let live? :)
Quote from the project description @ the top of the main page
The purpose of this site is to host texts and images in the public domain, or under a free licence. This site is hosted and managed in Canada and therefore it follows Canadian copyright law. Unless otherwise stated, all texts hosted here are in the public domain or under a free license. For a detailed discussion see Inclusion policy. You are welcome to publish files and texts here if they cannot be accepted in Wikimedia Commons and Wikisource. Works that can be accepted at those projects should be published there. You can test the publishing and editing processes used on this site in the Sandbox. This site has more than 3,000 books and documents, and 11,563 images in Template:PAGESINNAMESPACE:0 pages from more than 1,085 authors. This site does not belong to the Wikimedia Foundation.
(also, @ some point in the future, we might want to provide services such as audio books)
- Oh, thanks a lot - now and here I've learned what I didn't know before ;-), and moreover in a very kind and collegial way -jkb- (talk) 08:47, 4 April 2017 (UTC)
- Yes, a lot of files were transferred from Commons before being deleted there. Regards, Yann (talk) 12:56, 4 April 2017 (UTC)
Update on Filesizes / Filetypes
Hi folks, there are a few things to discuss, based on an analysis of the images folder content:
- 121 items over 8mb in size.
- 75 items over 16mb in size.
- 26 items over 32mb in size.
- Top four are images, from 56-93 mb in size (three Matisse and a Derain).
There are also: .ogv, .ogg, .odt, .mid files (very few). Mostly the files are .djvu, .pdf, .jpg/.jpeg, .gif, .png, and a few .svg
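A survey like the one above can be reproduced with a short script. What follows is only a minimal sketch in Python, not the procedure actually used to produce the figures above; the function name `survey_uploads` and the choice of pointing it at the wiki's images folder are assumptions made for illustration.

```python
import os
from collections import Counter

MB = 1024 * 1024
THRESHOLDS_MB = (8, 16, 32)  # same cut-offs as in the list above


def survey_uploads(root):
    """Walk `root` and return (files over each threshold, files per extension)."""
    over = {t: 0 for t in THRESHOLDS_MB}
    by_ext = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            size = os.path.getsize(os.path.join(dirpath, name))
            # Normalise the extension so ".PNG" and ".png" are counted together.
            by_ext[os.path.splitext(name)[1].lower()] += 1
            for t in THRESHOLDS_MB:
                if size > t * MB:
                    over[t] += 1
    return over, by_ext
```

Calling `survey_uploads("/path/to/images")` (a hypothetical path) returns one dictionary with the number of files above 8, 16 and 32 MB and one counter of files per extension. A file above 32 MB is also counted in the 8 and 16 MB buckets, which matches how the list above is phrased.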
There is one book that includes what appear to be pages from a German encyclopedia on the Classics, which consists of 8,274 png files. This is 1.97gb of images. This was user K67y. Maybe they were looking to turn the pages into text? Unclear (though that is what the Wikisource topic has done). In any case, the project had been worked on from 2010 until mid July 2016 on and off. The project looks to be interesting, but very little headway was made: Category:Paulys_Realencyclopädie_der_classischen_Altertumswissenschaft. However, an enormous number of page images were uploaded. See <https://de.wikisource.org/wiki/Paulys_Realencyclop%C3%A4die_der_classischen_Altertumswissenschaft>.
There are actually not very many PDFs in Wikilivres. Most files are images of some kind, and there are also many .djvu files.
Suggestions and Questions
- The large scale importing of one-image-per-page should be avoided. This project would have made more progress if some number of pages were imported, and then the text part were done for those pages, and so forth. Thoughts?
- Images (under PD-50 copyright) seem to be a natural extension (and indeed, many books have images). I especially think works of art are of interest. Thoughts?
- Audio and video (under PD-50) is very interesting, but the file sizes can get very large. Also, if video (or audio) becomes popular, then monthly bandwidth limits could be impacted. I suggest keeping these filetypes disabled. Thoughts?
- Any thoughts on what would be encouraged / discouraged regarding filetypes (especially pdf image files, which can be very large)?
- Until we find some money to support enough space, I would turn off uploading of new files, as it was on wikilivres.ca. Only texts. Right now, as I understand, we experience problems financing current site, which is only $35. -- VadimVMog (talk) 07:12, 7 April 2017 (UTC)
- Hi User:VadimVMog, I agree it seems we have no resources. But I think people are not sure how long this server will stick around, before investing. I'd like to build confidence, and one way is to allow uploads. I've set a limit of 16mb currently, and for the image and ebook filetypes. That should cover most text-based ebooks. If people have something bigger to upload, they can discuss with the other editors to gain support. I'm sure (I hope?) we will have enough donations by the end of April for the domain name. --Jeffmcneill (talk) 09:08, 7 April 2017 (UTC)
- I asked on Russian wikisource for donations to this site, but not many people work on Russian wikisource. So I wouldn't put my hopes on it. Speaking of space and filesizes, I would count how much space we should reserve for text. Then it would be possible to set a limit for images, a limit we do not want to cross. I would like to upload images for books; that adds a lot to a book. I think I can live without pdf's and djvu's. It is often possible to provide a link to an outside pdf or djvu. -- VadimVMog (talk) 09:38, 7 April 2017 (UTC)
- Currently we are at about 6.4gb of images (including thumbnails), but ~2gb of that is a single work as mentioned earlier. The database is about the same size. Looking at 10% growth/month, we could add 640mb in files/month and have more than a year, while keeping the same amount reserved for pages/database. I do agree that the pages should be a priority. Djvu is definitely superior to PDF and should be the first choice of formats. I'll look at organizing digitizing, ocr and file conversion information. --Jeffmcneill (talk) 13:08, 7 April 2017 (UTC)
- Many of the previously uploaded files can be deleted, because they were uploaded from Commons only because we had a problem with direct links to Commons - they didn't work properly and nobody fixed it. Now they work, so we can delete the files, and after that we can delete them directly from our internal wiki database to save room. I do not know that it is a big problem, and it can all be done, but it is a good idea to think about... Electron ツ ➧☎ 08:41, 8 April 2017 (UTC)
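The growth estimate in this thread (about 6.4 GB of files today, roughly 640 MB added per month) can be turned into a rough headroom calculation. The sketch below is written in Python under loud assumptions: the thread never states the disk size, so the 16,000 MB budget used in the example is purely hypothetical, and both function names are invented for illustration.

```python
def months_linear(current_mb, monthly_add_mb, budget_mb):
    """Whole months of fixed-size growth that still fit within budget_mb."""
    if monthly_add_mb <= 0:
        raise ValueError("monthly_add_mb must be positive")
    return max(0, (budget_mb - current_mb) // monthly_add_mb)


def months_compound(current_mb, monthly_rate, budget_mb):
    """Whole months of percentage growth that still fit within budget_mb."""
    if monthly_rate <= 0:
        raise ValueError("monthly_rate must be positive")
    months = 0
    size = float(current_mb)
    while size * (1 + monthly_rate) <= budget_mb:
        size *= 1 + monthly_rate
        months += 1
    return months


# 6400 MB and 640 MB/month come from the discussion above;
# the 16000 MB budget is a HYPOTHETICAL figure, not a known disk size.
linear = months_linear(6400, 640, 16000)      # fixed 640 MB every month
compound = months_compound(6400, 0.10, 16000)  # true 10% compounding
```

The two functions differ on purpose: adding a fixed 640 MB each month lasts noticeably longer than a true 10% monthly compounding, so it matters which of the two "10% growth" actually means when setting a limit for images.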
upload size limit seems to have been changed (since yesterday); now it is 16 mb.
is that where we are setting it? would sort of like a bit more "room" (esp for the churchill histories, & some other "major works"); but moreso wanted to be clear on it, one way or the other, so i can plan accordingly.
- I opened up the upload size so I could get all the old stuff in, and that high limit was not meant to be permanent. Sorry if that was the sense anyone got. 16mb is not huge, but it should support a lot of texts, images, and ebooks. Anything larger can always be discussed.
- One issue with files is not the original size, but backups. If we want reliable backups from different times in the past, we need to have multiple versions. For example, daily for 1 week, weekly for 1 month, and monthly for 1 year = 15 backups. Multiply this by the cost of backup storage, and of transferring the file to backups, and there is a non-zero cost for every mb. That said, priority should be on text-based documents, and images that are not huge. PDFs that are image-only (not searchable) are the least favorite kind of content, from a filesize perspective, but also from a user usability and search-engine findability perspective, much less delivering content over weak networks to limited devices.
- I think getting decent OCR available will help fix this situation, more than increasing the file upload limit. Thoughts? --Jeffmcneill (talk) 10:27, 11 April 2017 (UTC)
- agree about the desirability of text-readable books-files; but for many works in our "range" we're lucky to get anything; & for many of the "biggies" like churchill's histories, image-only is all i can find presently (unless we buy & copy the ebooks; i have no funds for that, though). will look on fileshare when i can, but right now i can't "dedicate" a machine to it (my desktop unit is down).
- agree that having ocr-capability will help greatly.
- but for larger works, or more graphical works, they just won't fit into 16mb. perhaps we could have a case-by-case process for that? or a "special project", or etc.?
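The backup-cost point raised above (every uploaded file is multiplied by the number of retained backup copies) can be put into numbers. This is a minimal sketch in Python under stated assumptions: the function name is invented, the price per GB-month is hypothetical, and transfer costs, which the discussion also mentions, are left out.

```python
def monthly_backup_cost(file_mb, copies, price_per_gb_month):
    """Estimated monthly storage cost of keeping `copies` backups of one file."""
    if file_mb < 0 or copies < 0 or price_per_gb_month < 0:
        raise ValueError("arguments must be non-negative")
    stored_gb = file_mb * copies / 1024  # total backup footprint in GB
    return stored_gb * price_per_gb_month


# One 16 MB upload kept in 15 backup copies (the count quoted above),
# at a HYPOTHETICAL price of 0.03 USD per GB-month.
cost = monthly_backup_cost(16, 15, 0.03)
```

The per-file figure is tiny, which is the point of the argument: the cost only becomes visible when multiplied across thousands of large image-only files.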
New Filetypes Supported
- please increase the range of file types accepted for upload? .pdf .djvu .epub most critical; possibly .gz & .xml. moderately larger max file sizes would be nice too; i have a growing number of churchill, milne, & mencken pdfs (optical, not ocr'd) that are too big to fit here (>__<) Lx 121 (talk) 22:38, 3 April 2017 (UTC)
- This is partially implemented, as there are some special extensions to manage the viewing experience of .djvu and others. However, the file extensions are now supported in upload:
- Here are the supported file extensions: 'djvu', 'epub', 'gif', 'jpg', 'jpeg', 'pdf', 'png', 'svg', 'tif', 'tiff'. I've also increased file sizes. But there should be consensus on what that limit is, and also the need to not have optical-only pdfs, if at all possible. --Jeffmcneill (talk) 16:38, 6 April 2017 (UTC)
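For anyone preparing a batch of files, the extension list above and the 16mb limit mentioned elsewhere in this discussion can be checked before uploading. The following Python sketch is an illustration only: the wiki enforces its limits through its own server configuration, not through this code, and `check_upload` is a hypothetical helper name.

```python
import os

# Extensions as listed above; the limit is the 16 MB figure from this discussion.
ALLOWED_EXTENSIONS = {
    "djvu", "epub", "gif", "jpg", "jpeg", "pdf", "png", "svg", "tif", "tiff",
}
MAX_UPLOAD_BYTES = 16 * 1024 * 1024


def check_upload(filename, size_bytes):
    """Return (ok, reason) for a candidate upload."""
    ext = os.path.splitext(filename)[1].lower().lstrip(".")
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"extension '{ext}' is not accepted"
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "file exceeds the 16 MB limit"
    return True, "ok"
```

A local file can be tested with `check_upload(path, os.path.getsize(path))`; anything rejected for size alone is a candidate for the case-by-case discussion suggested above.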
Sharing cost
Hi, there is a bit of this discussion in the domain section above, but we can enlarge the idea. I can't financially support a whole site like Wikilivres now, but I can help a little bit. What do you think? Is the Paypal account email@example.com OK for that? Regards, Yann (talk) 11:05, 13 April 2017 (UTC)
- Hi @Yann: I'd like to take the money so far and reimburse @Koavf: for the wikilivres.ca renewal, as that was what the money was going to be used for (domain registration). I'll set up an email at firstname.lastname@example.org in a few days and connect that to paypal, so it looks more legitimate. Costs are not that expensive, from a hosting perspective. I found a great deal on the current host: $10/mo (USD). There is also DNS ($0.50 monthly) and backup costs to Amazon S3 (maybe $5/mo, depending on the size of backups). Another option is to get sponsors for certain specific content. --Jeffmcneill (talk) 13:54, 13 April 2017 (UTC)
- This is much cheaper than what I paid, around 50 US$/month, not including domains. Regards, Yann (talk) 18:13, 13 April 2017 (UTC)
- This is the cheapest I've paid also, especially since we have 8gb of ram and 4 processors. But the host doesn't have any extra tools like Linode does (for which we pay about 2.5x, including daily full backups of the entire system). Hosting costs keep going down, also backup storage costs, though not as fast. I figure < $20/mo not including the domain. Though depending on what kind of growth path we will be on, that will increase over time. Also, the server resources and kinds of backups are better than the previous configuration. We get more for less (if only the rest of life were this way). --Jeffmcneill (talk) 04:40, 14 April 2017 (UTC)
- comment, various -- what is the DNS fee for, exactly?
- re; backups: i think we were getting free backups from the wikiteam ppl? though not instant or daily.
- very cool that we got the name back! any hope of recovering the rest of the lost content?
- no objection to the idea of sponsors. we could even consider advertising; as long as it is carefully chosen. we do not have the same conflict-of-interest problems as wikipedia; in that we are not creating articles "about" topics/subjects. we are "just" hosting content; & our main mission & purview are pretty clearly defined.
- we would need to have some kind of formal, credible "business structure" to operate on that level though; a "foundation" of our own? or to join somebody else's. back on the old site, was wondering if we could get one of the wikipedia regional chapters to take an interest in the project.
- re: hosting: as long as it's reliable, & has the capacity we need for storage, bandwidth, etc. cheap is probably more important than service "extras". we can always make or arrange for our own tools.
- re: restore, lost content. understood that was it for the wikiteam backup. was wondering if we stood any chance of getting any of the lost content from the old host (since we have recovered the domain name from them)? stuff from after the last backup was made; abt a month's worth of work. Lx 121 (talk) 10:14, 15 April 2017 (UTC)
- Hi @Lx 121: we don't have access to the actual host, only have the registrar nameservers for the domain point toward my servers. The hosting package probably expired earlier, and was then deleted (which is when Wikilivres became offline). We could get the registrar/domain name back working, but the server configuration and all the files are gone. Best to put out of your mind there is any possibility of recovery. We have what we have and are moving forward. --Jeffmcneill (talk) 04:13, 19 April 2017 (UTC)
- Hi, I do not want advertising on Wikilivres. Otherwise I would have done it long ago, and I wouldn't have to transfer the site to someone else in the first place. Seeing the hosting cost now, I don't see the point anyway. Regards, Yann (talk) 08:35, 16 April 2017 (UTC)
- I can respect that. How about sponsorship of selected pages? If Wikilivres is to grow, especially with more functionality (audio/video), then there are greater hosting costs. If we wanted to allow for larger collections of high-resolution artwork, again, more storage means more hosting costs. What I am thinking of would have to be acceptable to those who work on particular works, and not violate link-buying rules for Google. A site-wide example would be asking the hosting company to sponsor us in return for a "hosting provided by XYZ" link at the bottom of the page. I'm not sure we want to do site-wide links, but this is an example of unintrusive sponsorship.
Deletion of old content because it is somewhere else
I'm seeing content deletion for content that has been on Wikilivres for many, many years, sometimes close to a decade. The reason provided is that the content is at another location (e.g., Wikisource or WikiCommons). I'd like more clarity from the admins about what was policy in the past regarding duplicate content. It concerns me because removing content from here doesn't mean the content will remain at other locations. More importantly, content that has had a history here is linked to from other sites, has visitors based on those links, and provides signals to the search engines. Deleting content and not updating links breaks the web. Creating external redirects is difficult to do on MediaWiki, and has many negative points to it, so healing broken links is generally not feasible. While it is good to have a focused purpose for Wikilivres, a temporary shelving area that services other wikis is probably not the main goal. I'd really like admins to help me understand the history and practice regarding content deletion in the past. --Jeffmcneill (talk) 06:23, 15 April 2017 (UTC)
- which content in particular is getting deleted? haven't been watching closely, but the only thing i've noticed was one user's approach to "housekeeping"; deleting local image files which are duplicated from commons, giving "preference" to simply linking to the files there. if more than that is going, i haven't seen/noticed it yet.
- not clear on what or whether we had "policy" on this (duplication) beyond the basic idea that we were here to host material that can't be hosted @ the wikimedia projects (due to copyright rules there). setting aside image content that meets those criteria, i think "we" just started uploading the author-image files here when/because linking with those files on commons stopped working.
- i'm flexible on what policy we should have for this.
- 2 obvious points:
- 1. the other site needs to be a stable, reliable online resource
- 2. if we keep it here, we need to consider the impact that has on our site's hosting resources.
- As I remember, some content, esp. in Russian, was moved to wikilivres.ru, because they keep non-commercial and non-derivative works too; we don't keep them any more since our policy was changed a few years ago. And some content was deleted because it was moved to Wikisource, because it is PD everywhere now, not only in Canada. I haven't deleted some Polish works here (although some of them have already been moved to other services) because in my opinion it is better if they can be accessed from different services. It is also safer for them, and we don't gain any new space if we delete them here, because "deleted" pages are only hidden from non-admins and still take up space on the server. Electron ツ ➧☎ 10:09, 15 April 2017 (UTC)
- Ray enforced a much stricter license and content policy than I, and deleted everything which was under a NC license, and most of what was already hosted on Wikisource. Copying what is on Wikisource is useless, but I don't see the point of deleting what is already here. It won't save storage space, and it will break some external links to our content. Regards, Yann (talk) 08:39, 16 April 2017 (UTC)
- I agree there is no point in deleting stuff, with only very few special exceptions. Also, if pages/content are popular, they should not be deleted, and we won't know that until we get a better picture from the analytics I have running. Even the 2,7xx images (2gb) from that German encyclopedia, that drives me crazy, are getting visits, and I need to look closer to see if it is useful to our visitors, before considering deleting that content. Please folks, do not delete content unless it violates copyright or trademark policy. We can discuss deletions if people feel strongly content should not be here, but please discuss first. This is for both old and new content. --Jeffmcneill (talk)
Besides the question of licenses (like NC, see above), there is no reason to delete something that was published on Wikilivres according to the scope of this project. If some years later the text is not only PD 50 but even PD 70, so what? It was published here once, and it shows what we have done in the past. And we surely published the text before it was published on any Wikisource. It is our past, the sign of our work, and we should show all we have done. (OK, I speak here about texts, not files.) No deletions please. -jkb- (talk) 21:51, 24 April 2017 (UTC)
Category:Quality notations
The current Quality notations categories are as follows, with # of pages in each category, and the English description.
- Category:0% (144) - Works in project
- Category:20% (1) - Incomplete works
- Category:25% (289) - Incomplete works
- Category:50% (400) - Works completed but typography and layout to be corrected
- Category:70% (54) - Incomplete works
- Category:75% (5,298) - Works completed, including typography and layout
- Category:100% (2,974) - Proofread articles, on Wikilivres or by an external party
- It seems that 0% are NC works that have been blanked (contents removed but not deleted).
- 20% has 1 in it, so easy to move that to 25%
- 70% description seems to be the same as 25%, so unclear on that
- 75% seems to be an understatement, if the work is really complete
- 100% is a great idea, but proofreading is ongoing for wikis (in the case of correctable text)
So there seem to be three categories at base:
- Incomplete text
- Complete text, incomplete formatting
- Complete text, complete formatting
- I'm not against the idea of an improved quality&status rating system for content; can think of many different ways to go with that, haven't got deep enough into it to have a "best" choice. The present system is probably mostly borrowed from ws, at the time this place was started up; application of it has been pretty lax/hit & miss. We haven't really had the "critical mass" of ppl to operate such things. --Lx 121 (talk) 07:27, 15 April 2017 (UTC)
- For having 9,000 articles tagged, I would say that someone was doing a lot of work determining quality levels. If we simply work with what we have, collapse into three categories, and have some bots do category replacement, that would be an improvement. If bots could tag the top of a page in a given category with a visual indicator of the quality level of the work, then visitors (and editors) would have the information more visibly prominent. --Jeffmcneill (talk) 08:49, 15 April 2017 (UTC)
- did not know it was that many (you're right though, should have paid more attention to the numbers you posted; my apologies! had just gotten up, when i started on here). might be from "early days"; don't think anybody has been updating them in quite a while.
- I don't see a need to create a system that will be more complex, harder to use (and therefore avoided) and for what purpose? If we work with what we have, and simplify it, wouldn't that be better (better as in possible to implement quickly and be easier/simpler to use)? I'm suggesting options to fix something that seems to not work as well as it could. --Jeffmcneill (talk) 19:04, 15 April 2017 (UTC)
- Hi, I have always noted the completion status of the content I added. It is important to me that the reader is informed, especially if it was checked with a reliable source. Regards, Yann (talk) 08:44, 16 April 2017 (UTC)
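The suggestion earlier in this thread, collapsing the existing percentage categories into three and letting a bot do the category replacement, could start from a simple mapping like the sketch below. The page counts are the ones listed above. The mapping itself is an assumption read off the category descriptions, and Category:0% is deliberately left unmapped because those pages appear to be blanked NC works that would need a manual look.

```python
from collections import Counter

# Page counts per existing category, as listed in the discussion.
PAGE_COUNTS = {
    "0%": 144, "20%": 1, "25%": 289, "50%": 400,
    "70%": 54, "75%": 5298, "100%": 2974,
}

# Assumed mapping onto the three proposed categories (None = review by hand).
COLLAPSE = {
    "0%": None,
    "20%": "Incomplete text",
    "25%": "Incomplete text",
    "70%": "Incomplete text",
    "50%": "Complete text, incomplete formatting",
    "75%": "Complete text, complete formatting",
    "100%": "Complete text, complete formatting",
}


def collapsed_counts(page_counts=PAGE_COUNTS, mapping=COLLAPSE):
    """Total pages that would land in each collapsed category."""
    totals = Counter()
    for category, pages in page_counts.items():
        target = mapping[category] or "Needs manual review"
        totals[target] += pages
    return dict(totals)
```

Under this mapping, 344 pages would be "Incomplete text", 400 "Complete text, incomplete formatting", 8,272 "Complete text, complete formatting", and 144 would be held back for review. A real bot would also have to decide whether 75% and 100% should stay distinct, which this sketch does not settle.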
Format Notations - Category ?
As mentioned above, Quality Notations does not indicate what kind of format a work is in, which could be:
- Full text or Image-based pages - HTML (wiki), PDF, DJVU, Epub, Mobi, or some other format
- Starting with full-text markup-based formats (html/wiki/epub/mobi), all other formats can (in theory) be generated. Certainly wikitext to pdf.
- The same is not true starting with page-based formats (pdf/djvu) as even if full text, formatting results need to be completely redone manually.
- Without the overhead of full OCR plus manual proofreading, it is not possible to get from an image-based format to a text-based format.
It would be great if anyone visiting Wikilivres could easily understand what format given works are available in. Besides file formats, there is the basic: read-online vs. download. Archive.org is fairly good about showing different download file formats.
- Are there any categories or templates in-use that indicate filetype or file quality?
- For completed works that are in wikitext, is it generally the case that each chapter is a wiki article?
- Is there some kind of format that is generally used (or set of templates)?
- comment -- i think we basically just copied the process used by wikisource/en; but never had even their level of "manpower" to implement it. am certainly open to new/different approaches; haven't put in enough time thinking about it to have an opinion on "best" here either.
- clearly convertibility & downloadability are top considerations.
- except for page size limits, not clear what advantages there are to breaking up a work 1-chapter per page.
- Large files are also an issue when it comes to indexing a page (and also formatting it for different filetypes, such as pdf, ebook, etc.). A book chapter, though chapter sizes vary, is a good breaking point. The epub format helps a lot in this. Each chapter gets an H1 (which is what each page on the web is supposed to have, one and only one H1). For search, Google doesn't want to index 50,000 words from a page (which is an average-sized book wordcount), but rather 2,500 words in each of 20 chapters, or something like that. Chapter-sized wiki pages are good for users, good for mobile, good for Google, and good for organizing book elements (for other formats). --Jeffmcneill (talk) 16:24, 18 April 2017 (UTC)
Copyright and Inclusion Policy
Hi, I've made a Flowchart on how to determine if a foreign work is in the public domain in Canada. This is my current understanding regarding foreign works and Canadian copyright policy regarding the public domain. This applies to all Berne treaty and WTO/TRIPS accord signatories, which is all but a handful of nations. Please let me know what corrections to make to this image:
Inclusion Policy on Licenses
Currently the following licenses are accepted: PD, FDL, CC-BY, CC-BY-SA, ArtLibre.
- GnuFDL actually has some problems with it especially with short works. There are two clauses that can cause a lot of limitation on derivative works. Others have pointed out these limitations, and for that reason WikiCommons recommends dual-licensing if using FDL. I think it is more trouble than it is worth. There are only a few dozen items labelled this way and in some cases the use of the category is not accurate to the license provided, or there is multiple licensing. Note that GFDL is incompatible (both ways) with the GPL.
- ArtLibre is Free Art (in English) and it is not as well known (in the English-speaking world), as it is a French licence (though translated). Looks like this has been applied to 8 transclusions.
There are a few more licenses that work with what this place is (Public Domain + Free License). Note that several of these come from software so they won't necessarily be applicable (though some like GnuGPL are). The main goal is to allow for material under different licenses to be able to live on Wikilivres (and vice versa).
When uploading a file, these are the current options:
- Public domain (author's life +70 years old)
- Public domain in Canada (author's life +50 years old)
- Public domain in Canada (shorter term)
- Ineligible for copyright
- Creative Commons - Attribution 3.0
- Creative Commons - Attribution-ShareAlike 3.0
- GNU FDL
- Not free
- Not in public domain
I think this is complicated, and a bit misleading. Either something is in the public domain, or it is not, and a license needs to be declared. In addition, the terms of being in the public domain only apply within a given country. So the fact that there may be a Life+70 public domain in a country somewhere, does not apply to Canadian law. I kind of get where this was going, where people could find out where they could use the work. But this is different than licensing.
Suggested (first step) simplification
- PD - Public Domain (work already in the public domain)
- CC-0 - No Rights Reserved (copyright holder has released all rights)
- CC-BY - Attribution (creators require attribution)
- CC-BY-SA - Attribution Share-Alike (Attribution + derivative works require same license)
- GPL - GNU General Public License v3 or later + font exception (For those who want a GNU license)
Here are some relevant links to the licenses:
- Public Domain
- No rights reserved (CC0)
- Attribution (CC-BY)
- Attribution, Share Alike (CC-BY-SA)
- GPL with Font Exception
- I am not a lawyer, but I know that CC licences differ slightly between versions, so e.g. CC-BY-SA 2.0, CC-BY-SA 3.0 and CC-BY-SA 4.0 are not the same licenses for the lawyers (there was a stir on Wikipedia when they changed licence from GNU FDL and CC-BY-SA 2.0 to 3.0 some time ago). As for PD: in my opinion it is important for people from other countries to know whether it is PD-50, PD-60, or PD-70... or even PD-100 for people who lived in e.g. Mexico. Electron ツ ➧☎ 10:09, 16 April 2017 (UTC)
- Public domain laws by country are different from licenses, so these should not be mixed up.
- I do understand the desire to help visitors be informed, but there is no PD-50, PD-60, PD-70. There is only the public domain and for which country/countries. Under different countries' laws there are different durations. However, there are really a lot of exceptions and generally a lot of the bottom labels on the documents/authors are simply not correct. We can't mislead people, and we can't explain every single country for every given work.
- That said, there are a few things we can generally say:
- * Public Domain in Canada
- * Public Domain in the source country (if not a Canadian work)
- * Date of author's death (or if still alive), and whether that has reached Life+50 or Life+70
- In general there are really only two main terms: Life+50 and Life+70, then there is Life+60 (India + Venezuela), those handful of states that do less than Life+50 (Djibouti, Somalia, Yemen, Libya), and those that do more than Life+70 (around 10 countries).
- Canada simplifies things because basically everything becomes Life+50 (except for the four outlaw states), though this is no longer true for photographs.
- But my main point is that we cannot make statements that "works by this author are in the public domain in countries with laws that are life + XY years" (XY being a calculation based on this year minus death year minus). It is not correct in many instances.
- I think what would help would be a page that could be referred to regarding the law for the countries that Wikilivres editors are working most with, and a flowchart for each (on separate pages). Trying to cram that clarification into a footer doesn't work so well. --Jeffmcneill (talk) 16:19, 18 April 2017 (UTC)
- I realize this is a bigger issue and my ideas run counter to how things are done around here. I'd like to make this a project that interested Wikilivreans can discuss after I do more research. In the meantime I will take the suggestion of @Simon Peter Hughes: and look into other templates and/or creating alternatives. --Jeffmcneill (talk) 03:58, 19 April 2017 (UTC)
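The one piece of this that can be stated mechanically is whether an author's death date has reached Life+50 or Life+70. A minimal sketch follows, assuming the common rule that the term runs to the end of the calendar year, so a work enters the public domain on 1 January of the year after death year plus term. It deliberately ignores everything cautioned about above (photographs, rule of the shorter term, country-specific exceptions), so it shows the arithmetic only and makes no claim about any particular work.

```python
def public_domain_from_year(death_year, term_years):
    """First calendar year in which a plain life+term copyright has expired.

    Assumes the term runs to the end of the calendar year of expiry.
    """
    return death_year + term_years + 1


def term_reached(death_year, term_years, current_year):
    """True if the life+term period has fully elapsed by current_year."""
    return current_year >= public_domain_from_year(death_year, term_years)


if __name__ == "__main__":
    # Example: an author who died in 1961, checked in 2017.
    print(term_reached(1961, 50, 2017))  # Life+50 reached
    print(term_reached(1961, 70, 2017))  # Life+70 not yet reached
```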
Proposal Change to Inclusion Policy
Hi folks, the discussion above became confused with what is public domain vs. the acceptable licensing. I'd like us to take on the second part first. I have a concrete proposal, as follows:
- Remove GFDL and ArtLibre licenses as acceptable (we can keep what we have)
- Add the following two licenses:
The main reasons are as follows:
- GFDL has some restrictions that could be problematic in practice (requiring some part of a document be reproduced, but the rest of it being able to be modified).
- ArtLibre is two-way compatible with CC-BY-SA, therefore redundant
- The ND and NC-ND are to allow works to be free in the sense of distribution, but not in the sense of derivatives, or derivatives + economic benefit. I do not see these as being incompatible with the Wikilivres project, since we are at core a distribution project. In this case, works that want to restrict derivatives, or derivatives and economic benefit, still offer free distribution. That should be encouraged rather than discouraged, in my thinking.
For the first five years of the project these licenses were acceptable.
Note that I do not recommend accepting CC-BY-NC-SA or CC-BY-NC. These two licences create the problem of orphan works as argued by Stallman. He makes a good case for avoiding licenses that restrict economic freedom, but do not restrict derivative works. The main point is that changes (and therefore authorship) can pile up on a work, and it is then effectively impossible to negotiate use rights that are more permissive. Works become *orphans* in regards to having no practically identifiable rights holder, while still being restricted in use, usually for 5 or more decades. Stallman's suggestion is to use ND when using NC, which means economic rights are a kind of derivative work right, in practice. However, use of ND does not require NC, in the reverse situation.
Please share your thoughts on this change. We would effectively be increasing what we would accept on Wikilivres. However, I do not see any kind of significant impact from this change, in terms of server resources. --Jeffmcneill (talk) 16:47, 22 April 2017 (UTC)
- comment -- in practical terms, most of our content on this project is likely to be PD, but "in principle" i would favour including/allowing the widest possible range of licenses that we can, within the parameters of our project "mission"/goals; so presumably open source is preferred, & the less user-restrictions the better. that maximises potential content; & end-users simply don't care about the administrative considerations for running the project, they just want access to content. --Lx 121 (talk) 16:10, 24 April 2017 (UTC)
- OK for me. Yann (talk) 10:25, 30 April 2017 (UTC)
- I don't know that I have a particular take either way but you know that we used to have NC and ND works here and then got rid of them which is why wikilivres.ru was founded, correct? Koavf (talk) 08:48, 5 January 2018 (UTC)
- Yes, I consider that a lamentable error of judgment. Can't unring that bell, but it doesn't mean we have to keep a mistaken policy. --Sysadmin (talk) 17:45, 6 January 2018 (UTC)
- Well, we actually can--we can undelete if we have the logs. Did that all get transferred? Koavf (talk) 20:36, 6 January 2018 (UTC)
More on Licensing and Compatibility
As @Electron: notes, there are different, non-compatible versions of the Creative Commons license. I think there is no way around but to offer all of them as options. In general, most have the following versions: 1.0, 2.0, 2.5, 3.0, 4.0. Regarding Public Domain, there are basically three categories: Those works whose copyright has been waived by creators; those works whose copyright has expired; and those works which are ineligible for copyright (never were eligible for copyright in Canada). Creative Commons has two indicators for public domain, one which is the waiving, and the other which is merely a generic mark. However, jurisdiction is not included in the definition of the generic mark public domain, and therefore is only appropriate where a work is public domain in *all* jurisdictions. See: https://creativecommons.org/publicdomain/
To that end, Biblio.wiki does need (and generally provides) greater clarification. See below for basic text, which would then link (not currently) to descriptions of the licence:
Unique visitors to Wikilivres now 1,000+ Day
Moved to Bibliowiki:Site_Analytics
subpages for discussions
Folks, we shall soon have problems following all the discussions. I think that in such a small (!) project as Wikilivres we should concentrate all important discussions on one single page. We have at the moment some five to eight active users, so we don't need a special page for this and that, like on enwiki with some ten village pump pages. I would propose to discuss the category stuff here and not on Bibliowiki:Community Portal/Category Policy/En. Likewise, we do not need special subpages for English - Russian - French as proposed on Bibliowiki:Community Portal/Category Policy - the discussion will get broken up (as experience shows, even this main talk page Bibliowiki:Community Portal/en exists in one language only, and this is good). Cheers -jkb- (talk) 10:17, 24 April 2017 (UTC)
- Points taken. However, let's be clear about the intent here (though certainly the naming conventions can be changed). There are few if any policies written down, and those that exist are stretched over long periods of commentary in dozens of pages throughout the site. This is fine for those who have been here for a long time, but does nothing for new visitors. What is needed are a few very clear documents that lay out the functioning policies of this site. Otherwise what happens is that people violate these unstated policies, have their work deleted, and only then are things discussed or explained. On Wikisource there is on the Scriptorium a set of core policy documents linked at the top. The idea that Wikilivres can operate with a single policy document (the inclusion policy) is simply not realistic, nor kind to others.
- I am happy to have fewer places to discuss things, and also have voiced such requests. @-jkb-: please help clean up https://wikisource.org/wiki/Wikisource_talk:Wikilivres. There are several items with requests for feedback and comment that have very little to no response on this page, probably because one long page is very difficult to parse when it comes to keeping up and responding. Having discussions on wiki talk pages is difficult, as the technology does not fit the user needs. I understand perhaps folks have nothing to say, one way or the other. That's ok too.
- We do need some basic policies (which will be their own pages), and we can discuss these together, to whatever extent people want to be involved. Discussion can be on this page or those policy pages.
- There are two important issues currently, which are:
- * An improved copyright policy to be more inclusive, specifically with regards to CC-BY-ND and CC-BY-NC-ND (please provide feedback)
- * Some kind of consensus on a new domain name. I accept that it may not be possible to reach consensus.
- These two are significant, important, necessary changes. The first is to make a stronger claim and differentiate from the larger Wikisource and Gutenberg Canada (and get back to the roots of Wikilivres); and the second simply because there is a lot of work ahead to shift off of the current domain name, which we don't and can't control.
- It is important, at the very least, to document current practice regarding policy (such as categories, for example). Everyone is busy, believe me I get it. Documenting policy is meant to save time in the near and medium-term. If a few new pages of policy crop up, feel free to discuss or ignore, or discuss in the future. They are needed. The opportunity for discussion will be announced on this page in any case. --Jeffmcneill (talk) 19:10, 24 April 2017 (UTC)
support creating clear policy docs; but i will have no time to join the discussions until next weekend. have just finished uploading all the pd laura ingalls wilder 'little house' books, & that is it for my week here; i probably won't even have time to add more buck rogers comics until saturday. Lx 121 (talk) 19:25, 24 April 2017 (UTC)
- One point: there is just one and only Scriptorium on Wikisource for all community discussions. At the top, documents are linked, not talk pages - that's all right. But I think it is very dangerous to split the community discussion into several subpages (by theme) and then, moreover, into different language talk pages - and do not forget: Wikisource is bigger than Wikilivres and has many more users than we have. I have worked on such projects since 2004, Oldwikisource since 2005.
- To the page Bibliowiki:Community Portal/Category Policy/En etc.: if and when we have a new policy for categories (after discussing it here) we can create the page anyway, but in this case I'm not sure if we need something like that. Categories have their functionality on every WMF project (where we are coming from) and it is hard for me to imagine that we should have something special they do not have. More important is to create content in the right way (hm and sorry, the pages from today with some 300,000 bytes and more, not formatted, they are not good...).
- Regards -jkb- (talk) 21:57, 24 April 2017 (UTC)
- comment -- & that is what we have "quality ratings" for. also; this is how collaborative-projects, like a wiki operate. S.P.H. & I are co-operating to add books. between us, we now have the complete works of ian fleming, & all available (all PD) works of laura ingalls wilder. 2 very important & popular writers in english. & we are the first & so far the only website that has their complete PD collections. Lx 121 (talk) 16:07, 26 April 2017 (UTC)
spam accounts -- RESOLVED
Spam accounts seem to have found us; beside the new antispam tools we should need sooner or later the check user extension which is quite useful here. @Jeffmcneill:, do you think we can have it some time? Cheers, -jkb- (talk) 19:47, 24 April 2017 (UTC)
- addendum -- we also need to draft a proper "block notice" to post @ blocked user accounts. Lx 121 (talk) 19:49, 24 April 2017 (UTC)
- Some of the old blocked users can be seen in Category:Blocked users. But most of these accounts are no longer registered... Maybe we should declare an amnesty and delete the block messages, because the accounts are no longer registered and not actually blocked. Electron ツ ➧☎ 22:41, 24 April 2017 (UTC)
.DJVU format drawbacks and alternatives
The .DJVU format is quite good for what it is intended, namely an image- and page-based archival format. However, there are significant problems when it is used on a wiki, as follows:
- When used to do manual optical character recognition (by a human typing) it is reasonable (with the proofread extension)
- However it is generally large
- The format is akin to ZIP which is read by MediaWiki, which then produces thumbnails for each page image, which dramatically inflates storage space
- While there are many applications that support .DJVU, it is not a popular format in the ebook marketplace, and requires software installation to use. None of the major ebook vendors or sources have this as a standard file format (Amazon, Apple, Google, Kobo, Nook, Gutenberg, Archive.org, which has discontinued .DJVU file generation), and none of the major hardware ebook readers support this format.
- The main problem is that it is not an ebook text file format, but rather a page-based, image-based file format (akin to pdf).
- Conversion from DJVU to epub or mobi is not possible, or generally successful (akin to PDF to epub/mobi).
As an example, Ernest Hemingway's Across the River and Into the Trees is 11.8 MB in .djvu, plus an additional 185 MB in thumbnails generated by MediaWiki. At the same time, the .epub, .mobi, and .pdf files for this same work combined are only 2.4 MB.
Epub files are essentially HTML files (with CSS formatting and images) and can fairly easily be converted into wikitext (using tools such as Pandoc).
I understand the .djvu file editing workflow is one that has been around for a while. However, if epub files are available for a given work, those djvu files should be removed and replaced, for a dramatically improved user experience, for example Across_the_River_and_Into_the_Trees#Download. --Jeffmcneill (talk) 16:52, 2 May 2017 (UTC)
- comment -- tried to get pandoc working on my ubuntu system; nada, thus far. Lx 121 (talk) 07:47, 6 May 2017 (UTC)
- Jeffmcneill: DJVU files can't be replaced by epub files. These are different formats for different purposes. DJVU files usually contain images, while epub files usually contain text. DJVU compression can be defined on case by case, so DJVU can be very big for high quality (300dpi or higher) and high resolution, while it can be much smaller than the equivalent PDF files. Regards, Yann (talk) 18:48, 11 May 2017 (UTC)
- @Yann: yes I agree that these formats are for different purposes, and it is true that DJVU is a format which is much smaller than PDF for high quality page images. It is also true that DJVU cannot be converted into EPUB (text-based vs. page- and image-based formats, as mentioned above). However, if there is an Epub file for a given work, and any images in that ebook were high quality, but all text was text (and not an image of text), then that would be a desirable format to have, in terms of usability and in terms of server resources. I'm open to debate on this, but I just don't see any reason to keep a DJVU when Epub and PDF are available, given that the DJVU has image files of every page (and not searchable text) and the Epub and PDF files are actually mostly text, as in the Hemingway example mentioned above (Across_the_River_and_Into_the_Trees#Download). --Jeffmcneill (talk) 04:34, 12 May 2017 (UTC)
- @Yann: yes, I get that. But the DJVU files seem to never be removed. The Proofreading structure is kept in place. The Hemingway I mentioned was created in 2012 and split into Page: namespace by a bot. Maybe I just have a problem with the way the proofreading structure works. If one is reading this page: https://wikilivres.ca/wiki/Across_the_River_and_Into_the_Trees/Chapter_I and they try and click the Edit link, they get this:
<div class='text'><pages index="Hemingway - Across the River and Into the Trees.djvu" from=12 to=18 header=1 /></div>
- @Jeffmcneill: This is how it works on Wikisource. The text is transcluded from the Page namespace to the main namespace. I had a discussion with Thomas (Tpt), who maintains the Proofreadpage extension, about this issue. He suggests extending the VisualEditor to provide an easier edit interface for this. Regards, Yann (talk) 14:36, 17 May 2017 (UTC)
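For anyone trying to find where the text behind such a tag can actually be edited, here is a minimal sketch. It assumes the usual ProofreadPage convention that each scanned page lives at `Page:<index file>/<page number>`. The helper name `page_titles` is hypothetical; it only reads the `index`, `from`, and `to` attributes of the tag quoted above.

```python
import re

# Matches name="value", name='value', or a bare name=value attribute.
ATTR = re.compile(r'''(\w+)=(?:"([^"]*)"|'([^']*)'|([^\s/>]+))''')

def page_titles(pages_tag):
    """Return the Page: titles a <pages ... /> tag pulls in (inclusive range)."""
    attrs = {
        m.group(1): next(g for g in m.groups()[1:] if g is not None)
        for m in ATTR.finditer(pages_tag)
    }
    first, last = int(attrs["from"]), int(attrs["to"])
    return [f'Page:{attrs["index"]}/{n}' for n in range(first, last + 1)]

tag = ('<pages index="Hemingway - Across the River and Into the Trees.djvu" '
       'from=12 to=18 header=1 />')
for title in page_titles(tag):
    print(title)
```

Under that assumption, Chapter I is assembled from seven Page: entries (12 through 18), and those are the pages where the text itself would be edited.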
- @Lx 121: there are packages for Ubuntu and most Linux as well. May need to install LaTeX or even full TeX as well, for full PDF support. Even without that, though can test things like markdown to html, or wikitext to epub. I use CentOS so `yum search pandoc` shows several distributions. See also http://pandoc.org/installing.html which mentions an updated Debian distribution on their download page. --Jeffmcneill (talk) 04:40, 12 May 2017 (UTC)
- @Jeffmcneill: -- have done all of that, have installed everything i need to install (afaik), STILL doesn't work. i'm on ubuntu 16.04 lts; if you can tell me what i need to do or add etc., or what thing i need to do to "summon" the program to appear once it is installed, please help? xD because i've tried everything & GOT NOTHING for it >__<!?
- being able to convert into mediawiki format would be lovely, & it would solve most of my problems. mainly working with epubs atm. anything you can suggest?
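On the epub-to-wikitext question, a minimal sketch, assuming pandoc is installed and on the PATH. Pandoc is a command-line program, so it is run from a terminal (or from a script, as here) rather than launched from a menu. The file names are hypothetical; `-f epub` and `-t mediawiki` select pandoc's EPUB reader and MediaWiki writer.

```python
import shutil
import subprocess

def pandoc_command(epub_path, wiki_path):
    """Build the pandoc invocation that converts an EPUB file to MediaWiki markup."""
    return ["pandoc", "-f", "epub", "-t", "mediawiki", epub_path, "-o", wiki_path]

def convert(epub_path, wiki_path):
    """Run the conversion, failing early if pandoc cannot be found."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH; install it first")
    subprocess.run(pandoc_command(epub_path, wiki_path), check=True)

if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(pandoc_command("book.epub", "book.wiki")))
```

The printed line is the same command one would type in a terminal. The resulting wikitext will usually still need manual cleanup before it goes on a page.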
comment -- on the subject of file formats, we should try to offer end-users the best variety we can. priorities come down to 2 or 3 things:
1. the qualities/merits of a particular file format -type; which is "best", & best @ what/how?
2. end-user demand; what file types real-world endusers need/want/use.
3. ease-of-use & compatibility with mediawiki; needs to work on our site, & must be easy & obvious for endusers to access(!).
Pauly-Wissowa RE
First of all, thank you very much for revitalizing wikilivres.
I understand there are problems/questions around the RE project, and have created a short description at Category talk:Paulys Realencyclopädie der classischen Altertumswissenschaft (for further info please read User talk:K67y). Having the page images here, referenced from German Wikisource, is essential for that digitization project. Therefore, if resources are the main problem, any compromise offer would be an invaluable help, e.g. limiting the number of images, excluding them from the backup process, etc. K67y (talk) 21:04, 5 May 2017 (UTC)
- This project has largely stalled, with a single user uploading many thousands of images years ago, and a few pages of text editing. This is not the appropriate place for that sort of project. Amazon S3 (with Canada locations) might be a better match, or other Canadian-located wiki projects. Trying to host gigabytes of page images on this site is definitely not recommended, and hardly justifiable. Between search engines trying to index these pages (both the images themselves and the image pages), and thumbnail generation of each and every page, our server became overloaded. This is just not a sustainable kind of cache of page images. --Jeffmcneill (talk) 04:24, 12 May 2017 (UTC)
- Just to avoid the possibility of misunderstanding: does the above position mean, that the allowed number of RE-scans is zero and the allowed disk space is zero and this position of the person who pays for the site is not open to argumentation (although your formal concerns - if intended - could be addressed by resource usage limits, by disabling thumbnail generation, by excluding robots from these files/pages and so on) ? K67y (talk) 00:11, 15 May 2017 (UTC)
- The RE project has caused a lot of downtime for the site. The issue is more time (sysadmin time) not only money. I don't see any money coming from the RE project (perhaps there was in the past). I don't see any sysadmin time coming from the RE project to support the site. All I see is a dump of files, a few edited pages, not very usable to anyone but perhaps the RE project people who apparently access the files from some other site?
- Disabling all thumbnail generation is a non-starter as thumbnails are useful in the system. It is not possible (that I have found) to disable thumbnail generation on DJVU files/pages only. This requires manual intervention or scripting (but not only for RE but all DJVU files). Resource constraints don't really work in this case since the problem has to do with what is inside the DJVU system and how MediaWiki interacts/reacts to it. Really the DJVU/MediaWiki/Editing system is a huge somewhat broken hack that won't ever be improved, and needs to be replaced with something better. For those with DJVU formatted texts, this is the only viable system, but it never worked well, from a sysadmin perspective.
- I'm definitely not interested in the RE content returning to the site. No resources are being proposed for it, and there has been zero financial contribution to wikilivres/bibliowiki (and we don't ask for much). Remember, this is not a Wikimedia Foundation project with deep pockets and many paid developers and sysadmins to make everything work. --Jeffmcneill (talk) 07:58, 28 June 2017 (UTC)
Wikilivres.ca domain will DROP - Next Steps
- Moved to the new Wikilivres page (archiving the domain change discussion)
Dead Man's Switch
Hi folks, a simple mechanism to deal with the incapacity or demise of yours truly is a service that will send email in the event of my non-response (to emails). This service looks useful: https://www.deadmansswitch.net/ Since it sends email after 60 days, I will need to ensure that I pay ahead on the various services I use 90 days in advance. Also, I'll need to create separate accounts for these, which are mainly: Hosting, DNS, Domain Name. Thoughts? --Jeffmcneill (talk) 08:01, 12 May 2017 (UTC)
- comment -- well i hope you don't die on us! :/ but it's not only a matter of the technical aspects (being able to access the sysadmin controls, etc.). there are the legal "ownership" issues. in particular, who "owns" the rights to the name "biblio.wiki" (as regards domain name registration, & longer-term as "intellectual property" (ugh!)). if i understand our present situation at all, the key problem we have RIGHT NOW is that eclectiology was the registered owner of the registered site name 'wikilivres.ca', & that we have no way of transferring that "ownership" to anybody else, after his death. we need to fix that problem, as well as the technical issues of sysadmin/operations. --Lx 121 (talk) 10:10, 12 May 2017 (UTC)
- Wikilivres.ca is, for all intents and purposes, not ours and will not be. It is only temporarily pointed at our nameservers, hence it currently works. However, there is nothing more we can do, and it is not worth anyone's time to discuss it further. Hence the new domain name decision.
- Of the three main pieces -- domain name, hosting, and dns -- the domain name is the lynchpin, as everything else can be recreated/rebuilt. Though ideally you would not want to go through the process of rebuilding, believe me! So the hosting is also very important for practical reasons (configuration details, how things work). --Jeffmcneill (talk) 02:53, 13 May 2017 (UTC)
- There is a free option there. Maybe we can go this way?
Four recipients seems like enough? -- VadimVMog (talk) 18:30, 13 May 2017 (UTC)
Expense Report for last 60 days
Purpose and Criteria for inclusion in Biblio.wiki
I've realized (after stating otherwise) that from the beginning and until today the criteria for inclusion in Bibliowiki is:
You are welcome to publish files and texts here if they cannot be accepted in Wikimedia Commons and Wikisource. Works that can be accepted at those projects should be published there. - Main Page
I think I've been blind to this. While clearly this is the original mission (and was included (in slightly different language) when Wikilivres was created), it seems to me that this is too narrow a focus, especially since Bibliowiki has no relationship, sponsorship, ownership, or support from the Wikimedia Foundation. Also, as mentioned elsewhere, I am uncomfortable with having content evicted because it is found elsewhere on the Internet (and presumably, nearly all content would be accepted in WC/WS at some point in the future).
- i'd favour broadening the mission-scope & objectives; & removing the "either/or" conditional statement re: wikimedia projects. a practical consideration & limiting factor is the cost & capacities of our site & project, though. we need to have the resources to host the things that we want to host; & we need to host the things that we have the resources to host...
- & we should definitely work to keep a good & close working relationship with the WM projects. Lx 121 (talk) 11:51, 28 June 2017 (UTC)
- It is safe to remove such a restriction, I think. Why not? The only true restriction is how much server storage we have, but I don't think we can use it all up - there are too few editors working here. If/when we have more editors, let's hope money (storage) will grow too. -- VadimVMog (talk) 12:26, 29 June 2017 (UTC)
- I think that broadening the scope can provide a sense of security to editors who put in effort to place creative works in Bibliowiki. Editors would not have to worry about moving the curated works elsewhere in the future (as has happened in the past). This is along the same lines as inclusion of more creative commons licensing scope (the ND and NC-ND provisions). Not a huge change, but one that could be meaningful to some editors and collections.
Bibliowiki Namespace Impact
Hello folks, we've cut over to <https://biblio.wiki>. This also entails a change of the main namespace from Wikilivres to Bibliowiki. Some links (such as to this page) were originally, for example, https://wikilivres.ca/wiki/Wikilivres:Community_Portal and are now https://biblio.wiki/wiki/Bibliowiki:Community_Portal.
- The namespace was always programmatic so all page names are automatically changed, but links to those pages are not (and will have red underlines except on a few prominent pages I've made redirects to).
- Also, I've got the webserver so that it should automatically reroute to the new namespace, but it won't when the redirect is something like: https://biblio.wiki/index.php?title=Wikilivres:Site_support&action=edit&redlink=1
- Here is the list of pages that are in the Bibliowiki namespace and may (likely do) have links to the Wikilivres namespace. For each page linked there, an alternate page with the namespace of Wikilivres should most likely be created, with redirects to the Bibliowiki namespace pages.
- Anything else broken because of this? Please report.
- For those admins who have the ability to adjust our name and address in the interwikis (from wikilivres.ca/wiki to biblio.wiki/wiki) that would be very helpful.
- Note that I will take care of the main namespace pages, if other admins could handle the Category:Wikilivres categories.
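To illustrate the link change described above, here is a minimal sketch of the rewrite, covering both the `/wiki/Wikilivres:...` form and the `index.php?title=Wikilivres:...` form. This is not the actual webserver configuration; the function name `rewrite` is hypothetical, and it assumes only the host and the namespace prefix need to change.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

OLD_HOST, NEW_HOST = "wikilivres.ca", "biblio.wiki"
OLD_NS, NEW_NS = "Wikilivres:", "Bibliowiki:"

def swap_namespace(title):
    """Replace a leading Wikilivres: prefix with Bibliowiki:."""
    return NEW_NS + title[len(OLD_NS):] if title.startswith(OLD_NS) else title

def rewrite(url):
    """Map an old-style link to its new host and namespace."""
    parts = urlsplit(url)
    host = NEW_HOST if parts.netloc == OLD_HOST else parts.netloc
    path = parts.path
    if path.startswith("/wiki/"):
        path = "/wiki/" + swap_namespace(path[len("/wiki/"):])
    # Also handle index.php?title=Wikilivres:... style links.
    query = [(k, swap_namespace(v) if k == "title" else v)
             for k, v in parse_qsl(parts.query, keep_blank_values=True)]
    return urlunsplit((parts.scheme, host, path,
                       urlencode(query, safe=":"), parts.fragment))

print(rewrite("https://wikilivres.ca/wiki/Wikilivres:Community_Portal"))
print(rewrite("https://biblio.wiki/index.php?title=Wikilivres:Site_support&action=edit&redlink=1"))
```

A bot fixing in-wiki links would apply the same prefix swap to link targets rather than to full URLs.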
query - can we run a bot or something, to fix/update the in-wiki pagelinks? (or did i misunderstand the issue?) also, how long will the redirects from wikilivres "last"? Lx 121 (talk) 17:28, 3 July 2017 (UTC)
- I think @Electron: and his bot just fixed all the links. ^_^ As for Wikilivres.ca inbound traffic, this will redirect as long as the domain name is registered, and the current registrar points to our DNS records for name resolution. This was the generous act of the registrar when apprised of the situation. Under the CIRA (https://cira.ca) rules, they cannot change the owner/account of the domain, but they were able to update the nameserver records to my DNS service (Amazon Route 53). My guess is this will persist for the two years that @Koavf: renewed the domain for (around Jan/Feb 2019?). In the meantime we can contact anyone with links to make the changes. Most links should change automatically if they were using Interwiki links. If not, then the various Wikis, Wikisources, Wikicommons, and Wikibooks can be edited by ourselves. I can produce a list of inbound links from Google Webmaster tools. --Sysadmin (talk) 09:02, 4 July 2017 (UTC)
- Hi, Should we change on Wikimedia projects the shortcut [[wikilivres:]]? Regards, Yann (talk) 16:26, 4 July 2017 (UTC)
- Thanks @koavf:!
Former admins
Electron and I have added the list of previous admins on Wikilivres here. Where will we keep this list? Any opinions?
Regards and thousands of thanks to all who worked for this project.
Logo - transitional implemented -
OK, my dream to rescue and preserve Wikilivres has gone... :-) But: in all WMF projects it is still known as Wikilivres, no one (or nearly no one) knows what Bibliowiki is. Wouldn't it be an advantage if our logo (top left) showed not only Bibliowiki, but Wikilivres as well - sure: in much smaller letters in the second line, perhaps as "formerly Wikilivres"... ? Just an idea. -jkb- (talk) 23:05, 8 July 2017 (UTC)
- comment - i have no objection to this as a "transitional" logo; i think it's a good idea, but somebody else can do the graphics, i have no skills @ that. what about adding a small maple-leaf too? we are supposed to be a canadian website. Lx 121 (talk) 02:34, 12 July 2017 (UTC)
- @-jkb-: I've updated the profile image to include "formerly Wikilivres" (refresh browser if not visible). Transitional for 12 months ok? --Sysadmin (talk) 14:04, 13 July 2017 (UTC)
- @Lx 121: I'm not sure this is a Canadian website. It is definitely a website located in Canada. Perhaps a fine distinction? Based on content and visitors, I'd say this is an International website. Canadians, Canadian content, and Canadian editors are an important, but not predominant, part of the site. --Sysadmin (talk) 14:04, 13 July 2017 (UTC)
Threats to Canada's Life+50 Copyright Laws
do not want to turn us into a political chat group, but this item discusses the future of canadian copyright & summarises concerns, & where ongoing/upcoming trade negotiations might take us. stuff we should be aware of, as it affects the future of our project. http://infojustice.org/archives/38370 --Lx 121 (talk) 23:40, 11 July 2017 (UTC)
- I was reading something similar a week or so ago. These are only lobbyist proclamations. They are trying to pressure Canada, but there is no basis in law for demanding what they are, and in some cases the demands go beyond US law. See: https://www.techdirt.com/articles/20170309/07133936877/canada-says-it-wont-attend-special-301-hearing-because-ustr-prefers-industry-allegations-to-facts-data.shtml
- Due to the much more liberal (Liberal party) leadership in Canada, don't expect copyright extension anytime soon (unlike under the former Harper gov't). One big reason, besides everyone hating Trump, is that it would cost Canadians something like $4bn ($450m CAD/year) to extend copyright by 20 years. See: excesscopyright.blogspot.ca/2017/01/welcome-mr-walt-disney-to-canadian.html
- Basically, while it would impact us (we would have to move servers), Canada is a convenience. Those editors who are Canadian residents and citizens should definitely take up the cause, but the rest of us are foreigners who can only make encouraging noises from the sidelines.
- The light blue countries on the map below are Life+50. Compare this with the AWS datacenter locations. Currently South Korea looks to be the best fit. They also have Life+50, rule of the shorter term, and their 1986 revision included exclusions for already public domain works remaining so (akin to Canada re: US 1989 Berne accession). Another option is Hong Kong, though I haven't looked in detail at their copyright laws.
- Note, I've confirmed that Belarus also has very good copyright laws (possibly the only country in all of Europe that does). It has successfully resisted its CIS brethren's life+70 changes. It has life+50, rule of the shorter term, and no copyright on already public domain works. Also, it acceded to Berne with entry into force on 12 Dec 1997. Belarus has a variety of VPS options, and they generally tout having very fast speeds to Ukraine and Russia, as well as Western Europe. --Sysadmin (talk) 07:24, 15 July 2017 (UTC)
- I've looked further into this issue, and it appears the TPP is not dead, and countries such as Japan and New Zealand have already ratified it, which would include changes to copyright law (increasing to life+70) for both of these countries. Japan especially is spearheading the revival of TPP (which could not come into effect without the US, under current law). This means that we need to be aware there is a non-zero chance, between TPP and NAFTA re-negotiation, that Canada would change its copyright laws to go along with the US and the other TPP countries. --Sysadmin (talk) 08:10, 15 July 2017 (UTC)
- FYI, besides Japan signing the TPP, their current Life+50 also includes additional years added to WWII Allies publications for those years that were not provided copyrights, which essentially extends some countries by 10 years or more. This makes them effectively Life+60 for all intents and purposes.
- Another note, Hong Kong looks quite good. Their most recent laws are Life+50, they have rule of the shorter term, and previous to their recent laws they used the British Copyright laws of 1956 which is also Life+50 and rule of the shorter term. https://www.elegislation.gov.hk/hk/cap528?xpid=ID_1438403328195_003 --Sysadmin (talk) 16:07, 21 February 2018 (UTC)
Bibliowiki:Donate
I've added a donation page and included it in the sidebar. I've also expanded the part about being independent on the main English page:
- Bibliowiki is supported solely by the donation of time and money by its editors, and is not affiliated with, owned, or supported by the Wikimedia Foundation or any other organization. Donate to Bibliowiki today. All funds donated go directly to paying for the hosting costs of this website.
Could editors who can translate this into our local languages do so and put this on the home page of each of the local languages on Bibliowiki? Suggested changes to this text are welcome as well. Thanks to all efforts of the editors, financial and otherwise! --Sysadmin (talk) 04:57, 17 July 2017 (UTC)
Cannot specify email address
- Hello @Ineuw: email is currently not available. This is on the to do list. --Sysadmin (talk) 04:10, 12 August 2017 (UTC)
No magic, a simple explanation?
Surprising result when clicking this image of a sailing ship, a portrait is seen: the reason might be that there are two files with the same name. But I can't find where these two files have their respective locations, can you? --Zephyrus (talk) 00:02, 2 September 2017 (UTC)
- I see just Joseph Conrad, only. Maybe your browser's cache has not been refreshed? Electron ツ ➧☎ 08:50, 2 September 2017 (UTC)
- I also see a sailing ship on this page. I can tell you that the portrait of the writer is the locally uploaded file (File:Joseph Conrad.jpg) and the photo of the ship is the name of the file on Wikimedia Commons (commons:File:Joseph Conrad.jpg). Anyway, that means that I wasn't the only person seeing a picture of a ship on the Joseph Conrad page. So I think I was justified in changing the picture. Simon Peter Hughes (talk) 12:59, 2 September 2017 (UTC)
but what colour is that dress!? :P & presumably this is a software conflict btwn the local file & the commons file-fetcher-thing? we should probably set it to prioritise local files? Lx 121 (talk) 16:54, 2 September 2017 (UTC)
Same problem here. I see this image:
instead of this one:
Lots of Spam - Change User Creation Process? FIXED
Hi folks. I now see lots of spammers who have created accounts and been blocked within the last 30 days, and only one active new user in the same situation. If we change registration to one "by request" that would cut down on the spam, though it might be a barrier for potentially active new users. I think it is worth trying out, make people request an account by sending an email to email@example.com and I can deal with those requests (creating accounts as needed). My guess is it will be much less work creating legitimate accounts than banning illegitimate ones. Thoughts? --Sysadmin (talk) 11:55, 28 October 2017 (UTC)
- OK. If we don't have powerful automatic anti-spam tools installed yet (see eg -> , ), it is a good idea. Electron ツ ➧☎ 11:28, 29 October 2017 (UTC)
Maria Montessori Works - Public Domain Status?
Hello folks. For anyone with thoughts on this, I'd like to present the case of Maria Montessori's works and their public domain status. Montessori died on 06 May 1952. She was an Italian citizen for her entire life, and Italy has Life+70, which gives 01 Jan 2023. However, just before, during, and after WWII she lived in India (and was actually not allowed freedom of movement because British India considered Italian citizens potential risks). She published several works in India with Indian publishers during that time. India has Life+60 as a copyright length. At least in the case of Britain, the location (and not nationality) of a work's first publication determines which copyright law applies, and they recognize the copyright length of India via the Berne convention (minimum Life+50) with rule of the shorter term. This is largely the same situation for the EU in general and also the US, etc.
In Canada, copyrighted works in India, as elsewhere get Life+50 (rule of the shorter term). From what I can tell with this situation (and note that these works I am referring to are in English, either originally or in translation from Italian and French), all of these works published in India (between 1947-1949) should be in the public domain worldwide.
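The date arithmetic behind this can be sketched as follows. It is a simplification under stated assumptions: terms run to the end of the calendar year, the work enters the public domain on 1 January of the following year, and the rule of the shorter term is modelled as taking the smaller of the local term and the country-of-origin term. Translation copyrights, wartime extensions, and other special cases are ignored, and the function names are hypothetical.

```python
def public_domain_year(death_year, term_years):
    """Year in which a Life+N work enters the public domain (on 1 January)."""
    return death_year + term_years + 1

def shorter_term(local_term, origin_term):
    """Rule of the shorter term: protect for no longer than the country of origin does."""
    return min(local_term, origin_term)

MONTESSORI_DEATH = 1952

print(public_domain_year(MONTESSORI_DEATH, 70))                    # Life+70 (Italy): 2023
print(public_domain_year(MONTESSORI_DEATH, 60))                    # Life+60 (India): 2013
print(public_domain_year(MONTESSORI_DEATH, shorter_term(50, 60)))  # Canada, India as origin: 2003
```

If India is accepted as the country of origin for those editions, the Life+70 date of 2023 stops being the controlling one wherever the rule of the shorter term applies.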
- OK, but if it is a translation, its copyright status should also be checked. The translator owns his/her separate copyright in the translation. Electron ツ ➧☎ 11:35, 29 October 2017 (UTC)
Beatrix Potter to be featured on English Wikisource in December 2017
Hi! On enWS we are featuring the works of Beatrix Potter for the month of December. One of these works, The Tale of Little Pig Robinson, is hosted here on Bibliowiki. Consider yourselves warned :) On enWS, we usually protect featured works against editing by non-admins due to the higher risk of vandalism, and you may want to consider doing something similar. Beleg Tâl (talk) 19:48, 26 November 2017 (UTC)
- I have protected The Tale of Little Pig Robinson so that only admins will be able to edit it for the next two months. Simon Peter Hughes (talk) 03:59, 27 November 2017 (UTC)
Missing DjVu files (Fichiers DjVu manquants)
- Sorry to answer in English but I don't speak French. These files are not in the wikilivres dump I have (I mean wikilivresca_w-20170218-wikidump.7z taken from https://archive.org/download/wiki-wikilivresca_w). The djvu files end at the letter T, if they are sorted alphabetically. Something went wrong during the dump process, I'm afraid. Maybe the previous file dump (I mean: wikilivresca_w-20170213-wikidump.7z) contains the missing files? But I have no possibility to upload this file, because I have a weak internet connection at the moment. So if anybody can check this (and upload them manually if they are there), it would be appreciated (it is about 5.7GB to upload). Electron ツ ➧☎ 14:58, 21 December 2017 (UTC)
- Btw. There are also much older files:
- wikilivresca-20121112-complete.7z (3.8GB), available here -> https://archive.org/download/wiki-wikilivres_ca
- wikilivresca_w-20140110-complete.7z (5.0GB), available here -> https://archive.org/download/wiki-wikilivres_ca_20140110
- to be checked.
- To begin with, I would start with the latter file... But personally I am not able to at the moment, as I said. Electron ツ ➧☎ 15:13, 21 December 2017 (UTC)
- Sorry, I don't know English, only French. El Verdugo (talk) 10:52, 22 December 2017 (UTC)
- OK but it's a good idea to try (C'est une bonne idée d'essayer) -> https://translate.google.fr Electron ツ ➧☎ 01:07, 30 December 2017 (UTC)
- I have uploaded some missing files from / J'ai téléchargé des fichiers manquants de -> https://archive.org/download/wiki-wikilivres_ca_20140110 (wikilivresca_w-20140110-complete.7z) Electron ツ ➧☎ 10:53, 10 January 2018 (UTC)
Maximum file size (Taille maximale du fichier)
Admin/Bureaucrat Requirements/ Hosting Expenses -- RESOLVED
For 2017 we've received 14 donations from 9 individuals (three of these from Sysadmin) to support this site. Total raised was $147.07 and spent was $134.02, with a remainder of $13.05. Sysadmin has donated $52.55, slightly more than a third, and for the past five months donations have been self-sustaining. This is due to new donors nearly every month, and two donors (besides Sysadmin) who have made multiple donations.
Many editors with Admin rights have made a donation to Biblio.wiki, some have made several donations. However, some with Admin rights have not donated. To this end, I'm proposing that all with Admin rights be required to have made a $10 donation within the past 12 months. If Admin privileges are not seen as valuable, then that editor can function effectively with standard editor rights. However, if Admin rights are deemed valuable, then surely contributing to the ongoing website hosting costs is a legitimate request.
Another opinion might be that those with Admin rights are usually providing in-kind donations already through their efforts, and should not be required to help sustain the site financially. However, as a good number of Admins have donated, that is not a comprehensive belief.
I'm not suggesting that standard editors or visitors are required to donate, but rather, those with additional rights (with rights come responsibilities). Also, the idea is not for this site to make money, but rather to sustain itself through voluntary contributions of its editors and user community.
There will be an increase in hosting expenses coming in 2018, with movement to the AWS platform. Biblio.wiki will be sharing an EC2 t2.medium instance, and will have elastic IP, data storage, backup, and data throughput expenses. I'll cap this on my side at $20/mo. While it is 2x what we pay now, the network reliability and ability to do a disaster recovery/restore, and also faster network access helps manage risk better.
Moving to AWS as a host (located in Montreal) is one part of the additional effort I aim to put into the Biblio.wiki site. Some stale functionality still needs ongoing improvement. That said, we've come a long way from the site having disappeared. I'll be writing up an end-of-year synopsis detailing the technical issues and a roadmap for 2018.
- Please, give my tools to El Verdugo, in case it may be useful. And don't suppress Yann's rights, that would be very shocking. Thanks. --Zephyrus (talk) 05:30, 30 December 2017 (UTC)
Admins / Bureaucrats for 2018
Hello folks, the following are the current rights assigned for the beginning of 2018. Akin to Wikimedia projects, people who are not active do not need to retain advanced rights in the system. In the case of Biblio.wiki, as a self-funded project, activity includes some modest amount of financial contribution. Those who are unable to contribute financially may of course ask others to do so on their behalf.
- Admins: -jkb-, El Verdugo, Electron, Koavf, Simon Peter Hughes, Sysadmin
- Bureaucrats: -jkb-, Electron, Koavf, Sysadmin
Expenses - Spreadsheet
The file is located on Google Docs. Below is a snapshot.
Request to delete the page
moved to -> Bibliowiki:Possible copyright violations#Request to delete the page Electron ツ ➧☎ 10:41, 30 December 2017 (UTC)
Happy New Year!
Bibliowiki:1967 deaths, therefore now in the Public Domain. Dig in. Koavf (talk) 03:09, 1 January 2018 (UTC)
- FYI, I've added links to four additional sites with authors who died in 1967. --Sysadmin (talk) 14:50, 2 January 2018 (UTC)
DJVU and PDF file management proposal
Hello folks. I am keenly aware of the need to have as much Wikisource functionality as possible, as this is the core use of Bibliowiki. To that end, the ability to do proofreading/editing with page images next to a text editor is important. Also, I understand that PDF support of this same proofreading/editing is important (and currently not available). At the same time, we experience ongoing issues with DJVU (and PDF) files due to their file size and the creation of multiple thumbnails per page.
There may be a solution which modifies somewhat the workflow but has the advantage of preserving long-term access to the DJVU and PDF source files. This is based on the actual use of DJVU files, which fall into two categories:
- Those which are uploaded and fairly quickly a complete wikitext version of the text and images in the source file are created; and
- Those which are simply uploaded and left on the site.
For both of these, the DJVU/PDF source files actually need not live for long on the Bibliowiki server. In the first case, once the pages are created and proofread, the file can be moved offsite to a much less expensive location, but still be linked from the Bibliowiki pages, and accessible. In the second case, the same situation holds. That is, the DJVU files are not actually used as such for the proofreading/editing process. When those files are not needed, housing them offsite is a very reasonable approach (in terms of actual monthly cost, and in terms of lightening the load on processing and storing thumbnails on the Bibliowiki MediaWiki installation).
This will require a slightly different workflow, but it should have no real impact on the vast majority of what takes place on Bibliowiki. Here is a suggestion:
- DJVU/PDF files can be uploaded. Once uploaded, editors can tag them as being intended as an archive only, or intended to have wikitext/images proofread/edited.
- If the files are tagged as archive only, they will be moved to an offsite storage and linked from there.
- If the files are tagged as to be edited/proofread, or if no preference is stated, then it is assumed that they will be edited/proofread and a six month timer will begin.
- If after six months little or no progress is made, the files will be treated as archive-only (as above).
- If a modest amount of progress is made, the editors doing such work (and all other interested parties) will be contacted and an additional six months will be allotted to complete the work.
- Once a work is complete in terms of creating wikitext/images, the DJVU/PDF source files will be moved and linked as per above. The intermediate page images/thumbnails and the proofreading special pages will also be removed.
Any and all comments and suggestions are welcome. The above has the benefit of saving us from deleting actual source files, while preserving the more expensive and scarce resources of the functioning server. The offsite file storage will be on Amazon S3, and files will be immediately accessible via hyperlink when needed. These files will be tagged so that spiders do not index them and thereby create excessive additional demand for them on the general web.
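For anyone who finds the rules easier to follow as code, here is a minimal sketch of the proposed workflow. It is an illustration only, not anything running on the server. The names (`Tag`, `Action`, `SourceFile`, `next_action`) are hypothetical, and so is the 10% figure used for "a modest amount of progress", since the proposal does not quantify it. The sketch also assumes that a file whose extension has run out without completion is then treated as archive-only, which the proposal does not state.

```python
from dataclasses import dataclass
from enum import Enum


class Tag(Enum):
    ARCHIVE_ONLY = "archive-only"
    PROOFREAD = "proofread"
    UNSPECIFIED = "unspecified"  # no preference stated: treated like PROOFREAD


class Action(Enum):
    KEEP_ON_SERVER = "keep on server"
    MOVE_OFFSITE = "move offsite and link"
    EXTEND_SIX_MONTHS = "contact editors and allot six more months"
    MOVE_OFFSITE_AND_CLEAN_UP = "move offsite, remove thumbnails and proofreading pages"


REVIEW_MONTHS = 6        # from the proposal
MODEST_PROGRESS = 0.10   # hypothetical threshold, not from the proposal


@dataclass
class SourceFile:
    name: str
    tag: Tag
    months_since_upload: int
    fraction_proofread: float  # 0.0 (nothing done) to 1.0 (complete)
    extended: bool = False     # True once the extra six months were granted


def next_action(f: SourceFile) -> Action:
    """Return what the proposed workflow would do with this file now."""
    if f.tag is Tag.ARCHIVE_ONLY:
        return Action.MOVE_OFFSITE
    if f.fraction_proofread >= 1.0:
        return Action.MOVE_OFFSITE_AND_CLEAN_UP
    # The timer runs six months, or twelve once an extension was granted.
    deadline = REVIEW_MONTHS * (2 if f.extended else 1)
    if f.months_since_upload < deadline:
        return Action.KEEP_ON_SERVER
    if f.fraction_proofread >= MODEST_PROGRESS and not f.extended:
        return Action.EXTEND_SIX_MONTHS
    # Little or no progress, or (assumption) the extension has run out.
    return Action.MOVE_OFFSITE


if __name__ == "__main__":
    example = SourceFile("example.djvu", Tag.UNSPECIFIED, 6, 0.30)
    print(example.name, "->", next_action(example).value)
```

In practice the progress figure would have to come from the proofreading status of the index pages; how that is measured is left open here.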
- Looks good to me. — VadimVMog (talk) 03:47, 23 January 2018 (UTC)
- I respect that you want to defray costs but this seems like a very cumbersome process. To put a finer point on it, how much money do you think would cover the costs of upgrading the hosting to allow for indefinite storing of source files? Koavf (talk) 06:34, 23 January 2018 (UTC)
- The cost of keeping all those files on primary EBS storage is 0.11/gb/mo in the Canada Central region, and S3 is 0.025/gb/mo in Canada Central. There are also some costs for puts and requests for S3. However, costs are magnified in EBS because of the need for snapshot backups, which effectively double the price (to 0.22/gb/mo). This means S3 is roughly 9x cheaper for storage (0.22 / 0.025 = 8.8), while still having the convenience of files being immediately available as needed. Our current EBS is sized to 32gb, and the idea is to not exceed that (while allowing for normal growth of the projects).
- Part of the point is that these files are not being accessed, yet take up a large amount of space (relative to other files). This is because multiple thumbnails of different sizes are generated for each page in a PDF or DJVU file, in addition to the file itself. This means that a given DJVU, say any of the 20-50gb files of the Collected Works of Gandhi, can produce up to half a gb of thumbnails. These are thumbnails that have never been used in the editing/proofreading process.
- The most recent December request for donations, which yielded zero from everyone contacted directly, is an indication of the kind of financial support this project has. Things need to be done as cheaply as possible, while retaining as much functionality as possible. Also, the generation (and re-generation) of thumbnails is processor intensive and has on two occasions hung the server, which became unresponsive and had to be rebooted. --Sysadmin (talk) 07:40, 28 January 2018 (UTC)
- OK. I understand the problem. The proposal is not bad from the technical point of view. But what about the copyright problem? Amazon is an American firm, and storing "our" files there may be an infringement of copyright from the American point of view... They are PD in Canada, but usually not PD in the USA. Electron ツ ➧☎ 11:23, 28 January 2018 (UTC)
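As a rough check on the storage figures quoted in the cost reply above, the arithmetic can be sketched as follows. This is only an illustration: the rates are the ones stated in that reply (per gb per month, currency as quoted there), S3 put and request charges are ignored, and the function name `monthly_cost` is made up for the example.

```python
EBS_RATE = 0.11            # per gb per month, Canada Central (from the reply)
SNAPSHOT_MULTIPLIER = 2    # snapshot backups effectively double the EBS price
S3_RATE = 0.025            # per gb per month, Canada Central (from the reply)
VOLUME_GB = 32             # current EBS size quoted in the reply


def monthly_cost(size_gb: float, rate_per_gb: float) -> float:
    """Storage-only monthly cost; request and transfer charges are not modelled."""
    return size_gb * rate_per_gb


effective_ebs_rate = EBS_RATE * SNAPSHOT_MULTIPLIER
ratio = effective_ebs_rate / S3_RATE

if __name__ == "__main__":
    print(f"Effective EBS rate: {effective_ebs_rate:.2f}/gb/mo")
    print(f"EBS vs S3 price ratio: {ratio:.1f}x")
    print(f"{VOLUME_GB} gb on EBS with snapshots: {monthly_cost(VOLUME_GB, effective_ebs_rate):.2f}/mo")
    print(f"{VOLUME_GB} gb on S3: {monthly_cost(VOLUME_GB, S3_RATE):.2f}/mo")
```

The ratio comes out at about 8.8, so the saving per gb moved offsite is real, though the absolute amounts at the current volume size are small.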