Visit the Community Portal Archives
The following discussion has been copy-pasted from User talk:-jkb-/Temporary Scriptorium, where the community discussion took place from March 31 2017 to April 6 2017 when this "traditional" talk page was not available. For the history of the provisory talk page see here. ---jkb- (talk) 17:41, 6 April 2017 (UTC)
- 1 Provisory discussion started when migrating
- 2 Discussion started on the new site .io/.ca
- 2.1 Move to Discussions?
- 2.2 On Quality at Wikilivres
- 2.4 a suggestion for something new for the main page (EN; though certainly not opposed to expanding it to other langs)
- 2.5 page-sectioning
- 2.6 File Size Discussion
- 2.7 Filetypes and Upload Limits - Discussion
- 2.8 Sharing cost
- 2.9 Deletion of old content because it is somewhere else
- 2.10 Category:Quality notations
- 2.11 Format Notations - Category ?
- 2.12 Copyright and Inclusion Policy
- 2.13 Inclusion Policy on Licenses
- 2.14 Wikilivres.ca domain will DROP - Next Steps
- 2.15 Unique visitors to Wikilivres now 1,000+ Day
- 2.16 subpages for discussions
- 2.17 spam accounts -- RESOLVED
- 2.18 Pages disappeared
- 2.19 .DJVU format drawbacks and alternatives
- 2.20 Pauly-Wissowa RE
- 2.21 Dead Man's Switch
- 2.22 Expense Report for last 60 days
Provisory discussion started when migrating
Purpose of this page
While the migration to this new site is going on, it is probably better not to create new pages in anything other than the user namespace. As we shall need a place to coordinate the work in the future, I created this page to enable it; when the migration is finished and the site consolidated we can copy the content of this page to the "real" scriptorium as usual. Please feel free to edit here and discuss all necessary stuff. -jkb- (talk) 16:30, 31 March 2017 (JST)
- Sounds good. Perhaps adding to sidebar sooner than later is good, as we can now move discussion here from Meta and from Github (so things are in one place). --Jeffmcneill (talk) 16:57, 31 March 2017 (JST)
Restore - Latest Status
Moved to Technical requests page.
Basically; we need to figure all of that out, going forward.
@ the old site, we really had very little "formal" structure; & a very small group of active users, @ least for the years i've been active here.
it is my tentative understanding that the project was originally created by, & then spun off from, the cdn chapter of wp/wm. do not know why, how, or details.
since then, it's gone through at least 2 owners, & some domain & hosting changes.
the actual "working arrangement" (as re: actual work on/in the project) has been pretty informal, with different people & languages pretty much doing their own thing, within the basic framework.
owner-operating decisions have been "ad hoc" by the person(s) in possession of the domain, & paying the bills; with some private discussions among members, but really no formal "community process" type consultation or decision-making.
i'm not sure how "fancy" we want to get with setting something up, especially given how small our group presently is; but i'd suggest that we should form a "core group" to operate the site, of the people who were running it @ the old site/those most involved in setting up the transition/& those paying the bills.
they/we can be the informal "officers" of the organisation, pending any more formal/legal arrangements we work out; &/or our actually having a large enough active-membership to (be able to, & to be worth the effort to) run a "community-process", etc.
for the record, i am a canadian citizen & resident. am willing to put my real name forward for use on paperwork, etc. if that is helpful or needed; but would have to keep any money-matters (banking, tax, etc.) completely separated from my "real life" ("properly"/legally separate i mean; not "under the counter"-separate). unless the wmf wants to put us all on their paid staff... :p
- some corrections:
- the old wikilivres wasn't established by canadian Users, but by admins and bureaucrats of the oldwikisource and some stewards
- we had no "owner" or "officers"
- yep, we have some different languages etc., informal OK, but nobody was doing "their own thing", we were a team
- "owner-operating decisions have been "ad hoc" by the person(s) in possession of the domain" - what do you mean by this pls?
- would like to get an answer. Regards -jkb- (talk) (by the way, I don't think it is the right time to organize such ground discussions - we have other problems at the moment) -jkb- (talk) 02:18, 1 April 2017 (JST)
- reply -- was responding to the original questions in this section.
- the site was always "canadian" as far as i know; i.e.; operating under canadian copyright law. not clear on the exact history of its founding; but somewhere in the old on-wiki discussions, somebody said it had been spun off from/by the cdn chapter of wikipedia/wikipedians.
- & clearly someone must have "owned" the site, as far as domain name registration, hosting, & bills are concerned. in the time i've been a member (* ~2012-ish), the site ownership changed hands at least once, to eclectiology. hosting arrangements, & i think the exact site name also changed during this time. then eclectiology, later & sadly, died; leading to our present situation.
- i am not saying we were not a "team"; but there was very little formal organisation of the work. individuals or small groups of ppl would work on whatever materials interested them. which is not a bad thing.
- there was very limited interaction between users working in different languages, though
- & no major changes have been possible since eclectiology's death, we've just been working on the content since then; & holding some discussions & planning for what to do in the future.
- by "ad hoc", i mean that the person responsible for the site's domain name & hosting was making the operating decisions & changes as things came up, & more or less on their own initiative; with some discussion with a couple of the senior admins, i think, but no real community involvement in the process.
- i do not criticise this, i merely describe it. the one obvious problem with this approach was that the one single owner held all the "keys", & when eclectiology died there was no way to re-assign control.
well, with all due respect, for the time period from 2012 to now i speak from personal knowledge; about the "founding history" before that, i can only repeat what i have read on the wiki. & the records of all the on-wiki discussions & actions should be present in the backups now being restored.
certainly the site was set up as "canadian", & operated under canadian copyright laws.
certainly eclectiology was the most recent owner-operator of wikilivres.ca, certainly this person has died, & certainly there was no process to transfer control & ownership of the site. which brought us to the present situation.
& as i have said, i was replying primarily to the questions posted by the user who created this section; with some explanation of the previously existing structure & history of the project. if or where the explanations are in error, please specify & correct them?
Lx 121: FYI I created Wikilivres (as wikilivres.info and then wikilivres.org). At the end of 2009, I could not manage the site any more, and I looked for someone to take over. Ray offered, and I transferred the site to him. He moved the site to wikilivres.ca. Yes, it has always been hosted in Canada to benefit from Canadian copyright law, but I am not Canadian, and I am not in Canada. Regards, Yann (talk) 08:06, 1 April 2017 (JST)
- HI YANN :) good to see you again! & thanks for clearing that up. knew about the reasons for hosting it in canada, & the basic flow of events since i joined; did not know much about the history before that. but somewhere in a recent discussion about the future of wikilivres.ca (on that site), somebody said something about the cdn chapter of wikipedians having passed wikilivres on to somebody else, because they no longer wanted to operate it (we were discussing options for the future). clearly that information was wrong, or at least badly mangled(!) i actually am canadian, btw, & resident therein; so if we ever do need somebody to "front", act as an in-country agent-representative, etc. i can help with that xD Lx 121 (talk) 14:38, 2 April 2017 (UTC)
- Hi Everyone, sorry that my questions ended up with some ruffled feathers. I thank Lx 121 for jumping into the fray. I have probably the least amount of information about the group and its history and the like, and Lx 121 was surely meaning to be helpful to me in answering the question. He was the one who emailed me after the server had been down for several days, asking if I was still going to get involved. In any case, it sounds like a history of this wiki needs to be written down, and shared (and what better place than in a wiki) and that would help all newcomers understand what kind of community this is. I've come across some of the early discussions about copyright on WikiSource with Yann, and see the original impetus for the creation of Wikilivres. That is a very clear point: Canadian Copyright Law has more freedoms than many other countries, and a server in Canada provides that kind of legal shelter for an enlarged public domain, not available to servers hosted in the US, Europe, Russia, and the like. Also, it is obvious that everyone here is devoted to the public domain, or they would be elsewhere. As the newest technical person lending a hand, and in view of the recent events, a little more distributed organization at the top (no single owner, but a small group of the most interested/involved) would be in the best interests of the project. Hopefully this is not the end of the discussion. --Jeffmcneill (talk) 13:12, 1 April 2017 (UTC)
- Jeffmcneill, I don't think that your question provoked the discussion above; and we can certainly discuss the hierarchy etc. - my (and Zephyrus') "ruffled" reaction was directed at the claim that the canadian chapter might have founded wikilivres, and moreover at the claim that some officers or owners managed the site as they wanted. No. We were not a small but a very small community, but there were no tensions or fighting, we had a good collegial atmosphere and we did our work (which is by the way rather a "wikisource work" not a "wikipedia work"). Thus, and I wonder if I am alone with this evaluation, I will gladly discuss with everybody how to improve this work, but I am not ready to discuss problems with some owners, ruling officers or something like that, just because (see Zephyrus' edit above) this feeling comes from another world, not from the reality of this community. (And moreover I will not discuss the canadian WMF chapter as the founder of Wikilivres, as Wikilivres was never a project of WMF - everybody can read this statement on the main page). So, nice weekend, let's work - cheers < unsigned by -jkb- April 1, 2017 >
how the pages in Wikilivres are to be like
To all. Especially in the last days I found some new pages that are not fit to represent Wikilivres as a quality domain: pages with up to 1.4 MB text, unformatted, just copy&pasted, no headers etc., containing parts that don't belong there - see A Mencken Chrestomathy, see here as well etc. - and I would like to discuss this basically.
- It's difficult to say what is the maximum acceptable size, but it must be readable, and such long books must be divided into chapters etc. Up to now it was the normal and accepted practice.
- In addition to the size, the formatting used here must also be considered.
- If someone here uses such a long, not revised and adapted, and as usual formatted article, he can, of course, gradually work on a correct appearance, but I think it is wrong to put several pages of this sort in succession and in large numbers and stopping revise them after that. Also, I think it is wrong to leave the processing to others, that is not a team work.
- If someone has no time to edit such pages soon, I think it is appropriate, as usual, to park such pages first in his user namespace.
I think especially now, after relaunching our new Wikilivres, we should take care of good quality to gain a new good image on the internet. We want quality work, not a high quantity of works - I cite: "...that adding the text of an entire novel here should be a lengthy process and that adding an author's entire works could take years" [Simon Peter Hughes]. Please go ahead and comment. -jkb- (talk) 09:50, 6 April 2017 (UTC)
- I agree that good quality is important for visitors. Dumping large number of texts/images that need a lot of correction should be kept to user space until a certain level of quality is achieved. Better is that works should be done a few at a time. Chapter breaks are also important for readability. It might be good to have quality targets that we agree on, so that all editors can together have a shared idea of what Wikilivres should become. --Jeffmcneill (talk) 16:58, 6 April 2017 (UTC)
a polite observation -- i too appreciate the value of quality, BUT there are only about 8 of us working on here (& right now i am the only one uploading any significant amount of content in english);
either we can have a decent amount of content, which is actually useful to readers/end-users, OR we can only make polished "perfect" copy, only release content once it has been fine-combed, approved, signed, sealed, etc....
...& come out with a grand total of maybe 6-12 books per year.
EVERY YEAR, a new wave of dead authors goes PD; that means, at a minimum, HUNDREDS of notable works per year. we are NOT going to accomplish anything USEFUL, if the only thing we do, is make pretty-perfect texts, one at a time
if that's all we want to accomplish here, we might as well all just go join gutenberg.ca. seriously, they already have everything set up, & they are bigger & better @ processing copy than we are. we don't need a separate website just to imitate them, & do so less effectively; that's just "playing club".
if our goal is to provide a useful resource we need to be smarter, & faster, more innovative.
there are texts available, lots of them; some proofed, some not. from the gutenbergs, & from myriad other sources; whether public domain in their source countries, or not.
most of them are UNFINDABLE & inaccessible for ordinary end-users.
IF we collect them, get them online, in a pma-50 jurisdiction where hosting them is legal
ORGANISE them, so that it is easy for end-users to actually FIND things, & so that they can find ALL the works by an author in one place.
then, we are doing something useful, in REAL-WORLD TERMS; & doing something that nobody else is offering.
AND you are free to polish up the copy, as much as you like.
by all means, do so.
if proofing copy to perfection is what you want to work on, more power to you.
i'm going to work on collecting up pd pma-50 content, organising it, & actually getting it online for people to use.
it does NOT "hurt" wikilivres to have this content being uploaded. it may not be perfect copy; but, in real-world terms for our end-users, actually having a copy of a work, is INFINITELY MORE USEFUL than having nothing.
our texts can be MARKED for their quality ratings; i have absolutely no problem with that. right now, i'm usually categorising the copy as either 50% or 75%
if you want something more exacting, go nuts creating the most perfect quality-rating system you can or want to.
i'll even use it when uploading content; IF it's not impossibly difficult or time-consuming to add the tag.
but PLEASE do not obstruct me doing my work either.
will discuss the technical points of page-formatting separately
- I appreciate this point of view. In addition, the value (quality) of the content of Wikilivres cannot be measured only in bulk (megabytes), but rather in the usefulness to others. Accessibility is great, and is definitely something this site is devoted to, but there has to be actual humans getting value from what is here (visitors/readers) -- accessibility in practice, not just principle. The only way that happens is if documents/pages are presented in a way that:
- 1) the search engines can find and make sense of, and
- 2) that humans can interact with in a usable form.
- * Non-OCR PDF/DJVU are large and not searchable, which makes them much less accessible when it comes to finding them, and making use of them.
- * Unformatted/uncorrected text dumps (without headers, subheads, or metadata) are equally problematic.
- in reply:
- i. "the value (quality) of the content of Wikilivres cannot be measured only in bulk (megabytes)" -- i never said it was; but the quantity of material we can offer end-users MATTERS. end users aren't going to come here for 6 polished, perfect books per year.
- certainly not when gutenberg canada produces 10x that.
- ii. search engine 'optimisation' is a separate technical question; there's no difference in what we are discussing here, that is relevant to bot-processing.
- iii. there is an infinity of difference between "we have a copy of this work", & "we DON'T have a copy of this work". especially for end-users.
- a poor or marginal copy of a work is fundamentally more useful to someone-who-needs-a-copy-of-the-text than no copy.
- for example; there are a number of shakespearean plays for which no copy exists, at all, anywhere.
- is that better than having a poor copy of them? WHICH is more useful to someone who is studying shakespeare?
- as to the merits of pdf/djvu files, obviously it is more desirable to have a digitised text.
- when we don't have that, an optical scan pdf is better than nothing. i would have thought you'd find it preferable to a poor-quality ocr, tbh.
- & whether the content-text is searchable or not (& it's certainly more convenient when it is), the BOOK is findable.
- for example: recently, i uploaded a .pdf of the novel 'scoop' (1938) by evelyn waugh.
- the book was recently in the news as being newly-popular, because the story was "timely"/relevant to current american politics.
- there is NOWHERE online that you can legally download a free copy of this work, EXCEPT "us".
- Google and other search engines do not know about this book, because it isn't available in a searchable form, and there are no links to our page about this book. This means the book is essentially invisible. Because it cannot be discovered very well, it will likely stay invisible. Yes, building some links would help, but having an indexed book with full text would make it that much better. (P.S., I've found a usable epub, here is the text (I've added it to the page you created; now it's your turn to break it into chapters ^_^).) --Jeffmcneill (talk) 16:37, 11 April 2017 (UTC)
- THAT is "being useful" as far as end-users are concerned.
- if we were fully set up @ our permanent address, & did a little promoting, we could be getting some "real" traffic on the site, instead of just "us".
Discussion started on the new site .io/.ca
Move to Discussions?
On Quality at Wikilivres
There are still many technical issues to be ironed out, however (I hope) there are improvements every few days. The technical goal is a platform that supports the work of the Wikilivres editors, and makes Wikilivres better for all visitors. From a technical quality approach, we have limited resources (storage space, bandwidth). This means the focus should not be on quantity but on quality (and as much of that as possible). ٩(˃̶͈̀௰˂̶͈́)و
Public Domain Quality
Since we are dealing with a 20 year (or less) lifespan as the place where documents can reside in the public domain (until the rest of the world catches up), It may be important to keep in mind that what we bring here should be worth our labor. For some texts, there will be much more attention paid when they reach a PD+70 term. PD-50 is really about making works freely accessible (freedom from their (now expired) copyright, and free in terms of freely available, on this site). There are a few important categories of documents (books, images, etc.) that would be worth the effort of editors here:
- Hard to find or out-of-print
- Otherwise restricted by copyright holder
- Classics (hard to define, but that have importance now or historically)
The best candidates for Wikilivres (IMO) are expensive, hard-to-find, restricted classics. The worst candidates would be cheap, easy-to-find, mediocre works.
Of course, I am only one editor with one opinion, and the work of Wikilivres is as broad and varied as all the collective editors who put in their time and effort, whatever they work on. As a technical resource, my aim is to empower editors, while improving technical quality. As an editor myself, my goal is to improve public domain quality. Thank you for your patience during this technical transition, and I look forward to working together, and working separately on our own projects at Wikilivres. Keep the technical requests coming! --Jeffmcneill (talk) 05:15, 7 April 2017 (UTC)
comment -- do not disagree about the desirability of quality works, but:
we are not the only site "publishing" pd pma-50 content; nor the only one "grinding out" proofed-copy
(setting aside the undesirability of repeating proof-reading work done on the SAME-titles @ other projects; which can be considered as a technical point of operations)
what makes us "special" is the flexibility of being an open wiki project.
we can get stuff up faster, organise it better, & cross-link freely with other projects.
when a reader/user comes here, & looks up an author, they should be able to find a complete bibliography,
with access-links to as many of the author's works as possible.
(ideally, it would also be tightly-integrated with wikipedia>wikisource, wmcommons, etc.)
whether those works are hosted on our site or not (though it is certainly desirable that the hosting-resources should be stable & reliable), is an open question. by "default" i've been going on the assumption that "if it's not on wikisource, it should be here" (with interlinks both ways); but i'm open to revising that approach.
we do have better organisation of material than the gutenbergs, however; & we offer titles from any source, not just the ones that have been proofed here.
tl;dr -- if we're going to bother to do this at all, then we should try to do it better than the other projects.
in particular "better" in terms of real-world, practical usefulness to readers/end-users.
otherwise why are we here?
- I define better by the quality hallmarks mentioned above. Certainly with limited resources we won't be able to do better as in more complete, which I think is a recipe for burnout and low quality. --Jeffmcneill (talk) 02:49, 8 April 2017 (UTC)
- then the question becomes why are we doing this?
- literally; IF/THEN
- project gutenberg canada has a very nice distributed proofreading project, & they can at least turn out a few dozen titles a year.
- why don't we just join them, & save ourselves the trouble of operating this site? especially if the goal is to crawl out maybe 6 books a year. it's not worth it; operating a standalone project just for that.
- I worked on Project Gutenberg at about 2010. Maybe it all changed there now, but at that time to start a new project I needed permission from the admins. Also I was the only Russian working there at that time. That's why I moved to wikisource. Here everything is like wikisource, therefore it suits me. -- VadimVMog (talk) 05:59, 10 April 2017 (UTC)
- ad "permissions": you can find something similar on the German wikisource - the community likes to permit a project, if not, you cannot do anything. If you know the situation on some wikisources, it is a good idea to increase the quality, as many users start to insert a huge work by somebody and after formatting one chapter they are gone. All you can find from them is awful and worthless junk. -jkb- (talk) 08:51, 10 April 2017 (UTC)
- :) You are right, the problem exists. Sometimes I'd like to do something like this on Russian wikisource -- lots of vandals. Still I remember my feelings at that time and I left Project Gutenberg. On the other hand, by trusting every new user, we show respect to good people and they stay. It takes more effort to clean up, that's the price. -- VadimVMog (talk) 17:09, 10 April 2017 (UTC)
- P. S. Returning to the quality of this site. If there will be no admin cleaning out all dirt, life itself will force us to implement some strict policy to keep quality: either not to allow new users at all (like it was on wikilivres.ca) or to control new users (premoderation). What concerns our work quality, we can only create guidelines. What else can we do? -- VadimVMog (talk) 18:51, 10 April 2017 (UTC)
In my opinion the quality of edition is important. In my area I try to apply here the standards we have developed on Polish Wikisource and "my" wiki site "Ogród Petenery" ("The Garden of Petenera", see: wikia:c:wiersze), the wiki with Polish poetry published on free licences and that gathers old PD texts, too. Electron ツ ➧☎ 09:50, 10 April 2017 (UTC)
- re @ VadimVMog: I hoped we would need no special guidelines or rules, as both on wikisource and wikilivres.ca we had no problems with this - nearly everybody used to present good OCRed texts, with headers, chapters, source, no copyright violation, with the right licence etc. But it may be that we shall have to discuss this again. Lately there has been quite a large number of texts that do not fit these expectations (@Simon Peter Hughes: has been correcting many of them - many thanks for it). See #how the pages in Wikilivres are to be like as well, but see also the in the meantime archived talk deletion appeal. I'm pretty sure that some texts that are not formatted in the normal way and quality we know from wikisource do not belong in the main namespace. When I come back in some two weeks I shall suggest that (if not deleted) these texts be moved immediately either to a user namespace or to a special namespace, until they fit our quality requirements and expectations. (The same problem is the question of whether we want pdf-files in the main namespace etc.) -jkb- (talk) 21:27, 10 April 2017 (UTC)
a suggestion for something new for the main page (EN; though certainly not opposed to expanding it to other langs)
now that things are almost "back to normal" on the project, something i was thinking of proposing @ the old site, just before everything went blooey:
what about adding a PD comics section to the main page?
updated daily or weekly, say (can promise weekly, cannot guarantee daily by myself)
there are a wide range of titles to choose from; we could plan a rotation, or randomise, or etc.
would be fun, & would add a little variety to our mainpage; which is rather "static" & doesn't get updated a lot.
- A featured works section regarding new works might be nice. Do we have comics? Is that something important to this community? I see a lot of scientific nonfiction and literature here, maybe I'm missing something. --Jeffmcneill (talk) 02:49, 8 April 2017 (UTC)
- i started uploading buck rogers strips sort-of-daily, before the old site went down. only 4 of them made it into the new site.
- there are at least dozens of major comics of the past that are now pd, & for which at least some of the work is find-able. i'm willing to spend a limited amount of time on that; i could promise at least a weekly update to a comic on the mainpage; which is more regular attention than the mainpage has received, ever.
- presently we have a small collection of art-images, mostly from commons i think, that we can host, & they can't. like the kandinsky thing on your talkpage.
tl;dr -- the problem has always been "not having (enough) people to do the work". (actually, we have a new/recent works section on the mainpage; it was last updated in early 2016 xD) Lx 121 (talk) 05:38, 8 April 2017 (UTC)
page-sectioning
our present title-header template seems to be doing something that "suppresses" the little wiki menu-box for page-sections.
i need it to stop doing that.
ultimately, the "ideal form" for our texts seems to be "wikisource-style"; with a book broken down into multiple pages, chapter-by-chapter.
doing this is VERY time consuming (particularly absent auto-tools).
as an INTERIM STAGE in this process, i have been experimenting with "sectioning" the text (i.e.: also "parting out" the book, but on a single page).
it dramatically increases page-navigability, & is not nearly as time-consuming as creating multiple pages manually.
ALSO this makes it easier to divide a book into multiple pages later. (this could even be automated & done by a bot; with human review)
unfortunately, something in our page formatting setup is "suppressing" the menu-box with the page-section links. i assume it is the header.
IF somebody can fix that, that would go a long way to resolving certain disagreements we are having elsewhere, about copy.
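A minimal sketch of how a "sectioned" single page could later be divided into chapter pages by a bot, with human review afterwards. It assumes that chapters are marked with level-2 wiki headings ("== ... ==") and that each chapter becomes a subpage named "Title/Chapter heading"; both the function name and the naming scheme are assumptions for illustration, not an existing Wikilivres tool. It only computes the split and does not save anything to the wiki.

```python
import re

# Level-2 MediaWiki heading on its own line, e.g. "== Chapter 1 ==".
# Deeper headings ("=== ... ===") are deliberately not matched, so they
# stay inside their parent chapter.
HEADING = re.compile(r"^==[ \t]*([^=\n].*?)[ \t]*==[ \t]*$", re.MULTILINE)


def split_into_chapters(title, wikitext):
    """Return a list of (subpage_title, body) pairs, one per level-2 section.

    Text before the first heading (front matter) is ignored in this sketch.
    """
    matches = list(HEADING.finditer(wikitext))
    pages = []
    for i, m in enumerate(matches):
        # A chapter runs from the end of its heading to the next heading,
        # or to the end of the page for the last chapter.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(wikitext)
        body = wikitext[m.end():end].strip()
        pages.append((f"{title}/{m.group(1)}", body))
    return pages


if __name__ == "__main__":
    sample = "intro\n== Chapter 1 ==\nfoo\n== Chapter 2 ==\nbar\n"
    for name, body in split_into_chapters("Example Book", sample):
        print(name, len(body))
```

The point of the sketch is that consistent sectioning on the single page is what makes the later split mechanical; the review step would still be needed for front matter and irregular headings.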
- Please feel free to look at the templates and suggest on their talk pages what changes to make. You can experiment in your sandbox to get the right look. It may be something like __NOTOC__ that is doing what you describe.
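As a rough way to check the __NOTOC__ guess, the sketch below approximates MediaWiki's default behaviour: the contents box appears automatically once a page has more than three headings, __NOTOC__ hides it, and __FORCETOC__ or __TOC__ bring it back. The function name is hypothetical, and it only looks at the text it is given, so the header template's own source would have to be checked the same way, since a magic word inside a transcluded template also affects the page.

```python
import re

# Any wiki heading line from "= x =" to "====== x ======" with matching
# opening and closing runs of "=".
HEADING_LINE = re.compile(
    r"^(={1,6})[ \t]*[^=\n].*?[ \t]*\1[ \t]*$", re.MULTILINE
)


def toc_expected(wikitext):
    """Approximate whether MediaWiki would show a table of contents."""
    heading_count = len(HEADING_LINE.findall(wikitext))
    if heading_count == 0:
        return False  # nothing to list
    if "__FORCETOC__" in wikitext or "__TOC__" in wikitext:
        return True   # these override __NOTOC__
    if "__NOTOC__" in wikitext:
        return False
    return heading_count > 3  # default threshold: more than three headings


if __name__ == "__main__":
    page = "\n".join(f"== Part {i} ==\ntext" for i in range(1, 5))
    print(toc_expected(page))                # four headings, no magic words
    print(toc_expected("__NOTOC__\n" + page))  # suppressed
```

If the header template does turn out to contain __NOTOC__, adding __TOC__ to a sectioned page would be one way to get the menu-box back without changing the template for everyone.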
p.s.: two observations
1. any "page display size restriction", as on certain types of devices, or etc. will always be a temporary, "ephemeral" limit. every "next" generation of software & hardware will "move the bar" higher. 1mb(+) is NOT an unreasonable page-size; IF we are going to worry about this, then at least, we need to keep UP TO DATE on the ever-changing technical standards involved.
- While storage is growing quickly, bandwidth is not so much. There are a large number of projects (e.g., Google AMP) that try to deal with high-latency, low-bandwidth networks (especially for mobile), so the idea that acceptable page sizes will increase (and 1mb is acceptable) is not borne out. In order to be liked by search engines, and by users, it is important to keep the page size small. I've been watching this for the past 20 years and that ain't gonna change anytime soon (as much as we'd all like it to). Page size is not the place to put the constraint (e.g., certain size of all page elements) but rather the user experience of how long it takes to load enough of a page to begin interacting with the content. Two seconds is a useful goal to strive for. Less than that, even better, but more than that and a closer look at how the pages are being delivered (and size of assets) is needed. --Jeffmcneill (talk) 03:05, 8 April 2017 (UTC)
- comment -- mediawiki is incapable of delivering 2 second page loads now; not on wikipedia, & not here. it's nice to have goals; but we're going to have to change software to deliver that number in "real world" use.
- Wikipedia achieves 1 second load time, and this wiki is currently at 2.9 seconds (using GTMetrix and a Vancouver location). We can get to 2 seconds in North America, certainly. These are the kind of technical things I have done before with other sites. However, the page size can't be big to achieve that. --Jeffmcneill (talk) 16:37, 11 April 2017 (UTC)
- & i understand what you are saying about low-bandwidth, but even on low-bandwidth 1mb is "small change" nowadays. at least in the developed world; except maybe for extremely rural, or wilderness areas. phones use more than that, just "talking" to the network.
- I'm sorry but that is not accurate, and it seems the argument is on the one hand that large page size is ok, and on the other that if we really want to address the issue we have to have very small page sizes. The happy medium is achievable. I can do the server side optimization. Content sizes need to be reasonable. There is no other way. --Jeffmcneill (talk) 16:37, 11 April 2017 (UTC)
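To put rough numbers on the page-size versus load-time question, here is a back-of-the-envelope sketch. It only models raw transfer time plus one fixed latency figure; it ignores compression (plain text usually compresses well), server processing, and rendering, and the bandwidth and latency values are illustrative assumptions, not measurements of this site.

```python
def transfer_seconds(page_bytes, bandwidth_mbps, latency_ms=0.0):
    """Rough transfer time: fixed latency plus bytes over the link speed."""
    bits = page_bytes * 8
    return latency_ms / 1000.0 + bits / (bandwidth_mbps * 1_000_000)


def max_page_bytes(budget_seconds, bandwidth_mbps, latency_ms=0.0):
    """Largest uncompressed page that fits the time budget on this link."""
    usable = budget_seconds - latency_ms / 1000.0
    if usable <= 0:
        return 0
    return int(usable * bandwidth_mbps * 1_000_000 / 8)


if __name__ == "__main__":
    # Hypothetical links: (label, megabits per second, latency in ms).
    links = [("slow mobile", 1.0, 300.0), ("average", 8.0, 100.0)]
    for label, mbps, latency in links:
        t = transfer_seconds(1_400_000, mbps, latency)  # the 1.4 MB page
        cap = max_page_bytes(2.0, mbps, latency)        # 2-second goal
        print(f"{label}: 1.4 MB takes ~{t:.1f}s; 2s budget allows {cap} bytes")
```

Under these assumptions both positions in the thread hold at once: a 1 MB page is fine on a fast link, and well outside a two-second budget on a slow one, which is why the budget is better stated as time than as bytes.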
2. in its present state, our site is not set up to be "friendly" to users trying to read or download a long text on a mobile device, with a small screen; whatever the page size. IF becoming mobile-device friendly is a priority, then we need to get serious & set up a working group for that. because, right now "we ain't got that".
- I agree about the usability of reading on Wikilivres, and also about mobile-friendliness (two different, but related issues). --Jeffmcneill (talk) 03:05, 8 April 2017 (UTC)
3. uploading a large text to a single page is one stage in "processing"; by all means, DO move it along to the next stage. if or when you have the time & inclination to do so.
BUT if the choice is between a) having a text, & b) NOT having a text, then as far as end-users are concerned, something is better than NOTHING.
&, at the end of the day, end-users MATTER. we are here to provide a service, & resource: free copies of PD content, for people to USE.
- Actually a bunch of poorly formatted text probably has the same functional value to an end-user as no text at all: not usable. If the idea is that someone feels they can dump poorly formatted text into Wikilivres and it is up to others to do the tedious text formatting, that seems mistaken. --Jeffmcneill (talk) 03:05, 8 April 2017 (UTC)
- 2 points:
- 1. for an end-user, in real-world terms, the difference between "we have a copy of this text" (even a less than perfect one), & "we don't have a copy of this text" is infinity. it is the functional value difference between 1 & 0.
- 2. this is a wiki; hence a collaborative project. where people freely choose to come, & contribute the work they are interested in doing.
- if we are going to create a bunch of passive-aggressively restrictive rules, to "force" people to do things, we are never going to attract a userbase.
- after 14 years of operation, & with all the advantages of wikimedia & wikipedia, ws/en has an active user base of about 100 ppl; & an annual output of work that is a fraction of what gutenberg usa puts out in the same time period.
- if we simply, dumbly imitate wikisource, we are imitating a failing endeavour; and/or "playing club". we don't need this project to do that, or to dumbly imitate gutenberg canada; why bother doing that? especially when we could all just join there.
- i have uploaded about 1-2 works per day, into the english section; which is more content than the english section has gotten in a very long time. many of them are very hard to locate online.
- (i feel it is also worth pointing out that the user who raised the point doesn't even work in the english section)
- my goal is a minimum average of 1 work uploaded per day.
- i have also vastly improved the coverage of the bibliography pages of the authors i've worked on.
- & included links to works elsewhere (ws & the gutenbergs).
- for end-users; for ordinary people who come here looking for stuff, & who don't care about anything else on the wiki but that, the "utility" of these author pages has been increased at least exponentially.
- if you really & truly do not understand the "value" that this adds to the project, then i am sorry for you; & sorry for the project.
(there is also a "legally useful" aspect to our work; in that by PUBLISHING content we are "asserting" public-domain rights, & thereby protecting public-domain rights. use it, or risk losing it)
- One does not need to publish to preserve/protect public domain rights. There is no loss of rights in the public domain (that is actually the default), the loss of rights is to the author's exclusive rights (that are limited by law). --Jeffmcneill (talk) 03:05, 8 April 2017 (UTC)
- "on paper" yes. in practice -- legal rights that are not "exercised" & defended are
- Sorry but this is not factually correct. One must defend exclusive rights, but the default is a lack of exclusive rights (and therefore the public domain). Publishing/distributing public domain works has zero legal impact or value. We need to be clear on this. --Jeffmcneill (talk) 16:37, 11 April 2017 (UTC)
- when the usa started, their copyright was 7 years. then it was 14. then 28. then 56. then 95-120 years for "old" works, & 70 years pma for "new" works.
- in practical terms NOTHING has entered the public domain in the usa since 1998 (via expired copyright). nothing is "due" to enter it, until 2019
- before that, there were already repeated copyright term extensions enacted by congress; starting in the 1970s.
- in practical terms, it is increasingly unclear if anything is going to enter the public domain in the USA, ever again (via expired copyright).
- & canada almost lost pd pma-50 to the TPP.
- one of the several useful purposes that this site serves, is to exercise our legal right to the public domain, by publishing pd content;
- especially recently-pd content.
for a simple analogy, what we are doing is "beating the bounds" (marking out the boundaries) of the public domain in canada, with full & regular "updates" to those bounds.
- This is not legally correct. Let's be accurate here. --Jeffmcneill (talk) 16:37, 11 April 2017 (UTC)
File Size Discussion
- Moved from Wikilivres:Technical requests
Filetypes and Upload Limits - Discussion
Hi everyone. I can increase the range of acceptable filetypes and also file size. However, we don't want a situation where the disk gets full immediately. First, what filetypes do people need? Certainly: .gif, .jpg, .png, .pdf, .djvu, .epub. Compressed files are more difficult to deal with (where does the decompression happen?), but maybe there is something clever where they are compressed and decompressed on the fly (I'll look into this). Let me know more about .xml filetypes, as I am unsure what is in this format or how it would be used by the wiki or by users. What size files are reasonable, in the opinion of Wikilivres editors? For example, what fits 50%, 80% and 100% of what people have to upload?
While unlimited storage will be nice at some point in the future, there are certain issues (such as backup window expansion, and paying for that storage). Thoughts? --Jeffmcneill (talk) 06:46, 4 April 2017 (UTC)
- comment - would be nice to offer a full range of media file types; though even w/o storage issues the selection is going to be limited (1st disney cartoons go pd in a few yrs though; once ub iwerks reaches pma-50)
- optical page scans use way more data than digitised text, but most books i've been finding come in under 30 megs; many well under. a few of the bigger items, like some of churchill's grand histories, are bigger though. nothing i've seen was @ or over 100 megs; except one a a milne text that was ridiculously large (1.5 gb; i assume because somebody made a bad choice in the settings somewhere. would not even want to try uploading that here; it is unusable. might try compressing it, if i can't find a smaller copy)
- for graphical books though, total file sizes might be bigger; but it's better to have those broken down page by page anyway.
- tl;dr - for optical-scan ebooks 90%+ should fit under 30 megs; maybe 50% of them will be less than 15 mb.
- we could also explore ways to compress items more.
- but for a few special works, we might need to make exceptions (at least until we can digitise the text).
- for digitised text, without "photographs" of the book's pages, file sizes are much less.
- don't know enough about the options, to say anything useful about hosting storage cost considerations right now. except the obvious that free is always good, & also it would be lovely to find generous sponsors/partners for our project xD
- p.s.: to clarify; i need this only for books where the only available copy is a "pure" optical scan (w/o the text digitised). so far, that is a minority of the works, but it includes a number of really "key"/important items. Lx 121 (talk) 07:35, 4 April 2017 (UTC)
- I'm afraid I will slightly oppose: Wikisource (and Wikilivres as well) are primarily for optical text, and we must discuss sometimes what we want to publish. I can remember there was a discussion about the great amount of PDFs that have never been digitalized (Eclecticology demanded a digitalization, I have the same point of view). And, if we can use the depository of Commons - why pay for it? -jkb- (talk) 07:45, 4 April 2017 (UTC)
- comment -- wikilivres is not just a "copy" of wikisource. it never has been, nor should it be.
- & our collection is more than just texts, & it always has been.
- & the answer to "why not just use commons?" is obvious, & it is exactly the same as the answer to "why not just use wikisource?"
- BECAUSE WIKILIVRES IS PD PMA-50, & commons is NOT
- if you only want to work on bare texts, then i have no objections to this; but in return, i ask that you please do not object to other people who want to work on other things.
- Live, & let live? :)
Quote from the project description @ the top of the main page
The purpose of this site is to host texts and images in the public domain, or under a free licence. This site is hosted and managed in Canada and therefore it follows Canadian copyright law. Unless otherwise stated, all texts hosted here are in the public domain or under a free license. For a detailed discussion see Inclusion policy. You are welcome to publish files and texts here if they cannot be accepted in Wikimedia Commons and Wikisource. Works that can be accepted at those projects should be published there. You can test the publishing and editing processes used on this site in the Sandbox. This site has more than 3,000 books and documents, and 11,563 images in Template:PAGESINNAMESPACE:0 pages from more than 1,085 authors. This site does not belong to the Wikimedia Foundation.
(also, @ some point in the future, we might want to provide services such as audio books)
- Oh, thanks a lot - now and here I've learned what I didn't know before ;-), and moreover in a very kind and collegial way -jkb- (talk) 08:47, 4 April 2017 (UTC)
- Yes, a lot of files were transferred from Commons before being deleted there. Regards, Yann (talk) 12:56, 4 April 2017 (UTC)
Update on Filesizes / Filetypes
Hi folks, there are a few things to discuss, based on an analysis of the images folder content:
- 121 items over 8mb in size.
- 75 items over 16mb in size.
- 26 items over 32mb in size.
- Top four are images, from 56-93 mb in size (three Matisse and a Derain).
There are also: .ogv, .ogg, .odt, .mid files (very few). Mostly the files are .djvu, .pdf, .jpg/.jpeg, .gif, .png, and a few .svg
There is one book that includes what appear to be pages from a German encyclopedia on the Classics, which consists of 8,274 png files. This is 1.97gb of images. This was user K67y. Maybe they were looking to turn the pages into text? Unclear (though that is what the Wikisource topic has done). In any case, the project had been worked on from 2010 until mid July 2016 on and off. The project looks to be interesting, but very little headway was made: Category:Paulys_Realencyclopädie_der_classischen_Altertumswissenschaft. However, an enormous number of page images were uploaded. See <https://de.wikisource.org/wiki/Paulys_Realencyclop%C3%A4die_der_classischen_Altertumswissenschaft>.
There are actually not very many PDFs in Wikilivres. Most files are images of some kind, and there are also many .djvu files.
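As an illustration only, size counts like the ones above could be gathered with a short script. This is a minimal sketch, not the analysis that was actually run: the function name `count_large_files` and the idea of walking the images folder from Python are assumptions, and only the 8/16/32 mb thresholds come from the discussion.

```python
import os

MB = 1024 * 1024
# Thresholds taken from the list above (8mb, 16mb, 32mb).
THRESHOLDS_MB = (8, 16, 32)


def count_large_files(root, thresholds_mb=THRESHOLDS_MB):
    """Return {threshold_mb: number of files under root larger than it}.

    Hypothetical helper: `root` would be the wiki's images folder.
    """
    counts = {t: 0 for t in thresholds_mb}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files that vanish or cannot be read.
                continue
            for t in thresholds_mb:
                if size > t * MB:
                    counts[t] += 1
    return counts
```

A file over 32mb is counted under all three thresholds, which matches how the list above is phrased ("over 8mb", "over 16mb", "over 32mb").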
Suggestions and Questions
- The large-scale importing of one-image-per-page should be avoided. This project would have made more progress if some number of pages were imported, and then the text part were done for those pages, and so forth. Thoughts?
- Images (under PD-50 copyright) seem to be a natural extension (and indeed, many books have images). I especially think works of art are of interest. Thoughts?
- Audio and video (under PD-50) is very interesting, but the file sizes can get very large. Also, if video (or audio) becomes popular, then monthly bandwidth limits could be impacted. I suggest keeping these filetypes disabled. Thoughts?
- Any thoughts on what would be encouraged / discouraged regarding filetypes, especially pdf image files (that can be very large)?
- Until we find some money to support enough space, I would turn off uploading of new files, as it was on wikilivres.ca. Only texts. Right now, as I understand, we experience problems financing current site, which is only $35. -- VadimVMog (talk) 07:12, 7 April 2017 (UTC)
- Hi User:VadimVMog, I agree it seems we have no resources. But I think people are not sure how long this server will stick around, before investing. I'd like to build confidence, and one way is to allow uploads. I've set a limit of 16mb currently, and for the image and ebook filetypes. That should cover most text-based ebooks. If people have something bigger to upload, they can discuss with the other editors to gain support. I'm sure (I hope?) we will have enough donations by the end of April for the domain name. --Jeffmcneill (talk) 09:08, 7 April 2017 (UTC)
- I asked on Russian wikisource for donations to this site, but not many people work on Russian wikisource. So I wouldn't put my hopes on it. Speaking of space and filesizes, I would count how much space we should reserve for text. Then it would be possible to set a limit for images, the limit we do not want to cross. I would like to upload images for books; that adds a lot to a book. I think I can live without pdf's and djvu's. It is often possible to provide a link to an outside pdf or djvu. -- VadimVMog (talk) 09:38, 7 April 2017 (UTC)
- Currently we are at about 6.4gb of images (including thumbnails), but ~2gb of that is a single work as mentioned earlier. The database is about the same size. Looking at 10% growth/month, we could add 640mb in files/month and have more than a year, while keeping the same amount reserved for pages/database. I do agree that the pages should be a priority. Djvu is definitely superior to PDF and should be the first choice of formats. I'll look at organizing digitizing, ocr and file conversion information. --Jeffmcneill (talk) 13:08, 7 April 2017 (UTC)
- Many of the previously uploaded files can be deleted, because they were uploaded from Commons only because we had a problem with direct links to Commons - they didn't work properly and nobody fixed it. Now they work, so we can delete the files, and after that we can delete them straight from our inner wiki base to save the room. I do not know that it is a big problem, and all can be done, but it is a good idea to think... Electron ツ ➧☎ 08:41, 8 April 2017 (UTC)
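The growth estimate in this thread (about 6.4gb of files, adding roughly 640mb per month) can be turned into a small calculation. This is only a sketch: the disk capacity is not stated anywhere in the discussion, so `ASSUMED_CAPACITY_MB` is a made-up placeholder, and the growth is treated as a flat 640mb per month rather than compounding.

```python
def months_of_headroom(current_mb, monthly_growth_mb, capacity_mb):
    """Whole months of flat growth that fit before capacity is exceeded."""
    if monthly_growth_mb <= 0:
        raise ValueError("monthly_growth_mb must be positive")
    headroom = capacity_mb - current_mb
    if headroom <= 0:
        return 0
    return headroom // monthly_growth_mb


CURRENT_MB = 6400            # ~6.4gb of images, from the discussion
GROWTH_MB = 640              # 10% of the current size per month, from the discussion
ASSUMED_CAPACITY_MB = 20000  # hypothetical space reserved for files, NOT from the discussion

print(months_of_headroom(CURRENT_MB, GROWTH_MB, ASSUMED_CAPACITY_MB))
```

With these numbers, lasting a full twelve months needs at least 6400 + 12 x 640 = 14080 mb set aside for files.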
upload size limit seems to have been changed (since yesterday); now it is 16 mb.
is that where we are setting it? would sort of like a bit more "room" (esp for the churchill histories, & some other "major works"); but moreso wanted to be clear on it, one way or the other, so i can plan accordingly.
- I opened up the upload size so I could get all the old stuff in, and that high limit was not meant to be permanent. Sorry if that was the sense anyone got. 16mb is not huge, but it should support a lot of texts, images, and ebooks. Anything larger can always be discussed.
- One issue with files is not the original size, but backups. If we want reliable backups from different times in the past, we need to have multiple versions. For example, daily for 1 week, weekly for 1 month, and monthly for 1 year = 15 backups. Multiply this by the cost of backup storage, and of transferring the file to backups, and there is a non-zero cost of every mb. That said, priority should be on text-based documents, and images that are not huge. PDFs that are image-only (not searchable) are the least favorite kind of content, from a filesize perspective, but also from a user usability and search-engine findability perspective, much less delivering content over weak networks to limited devices.
- I think getting decent OCR available will help fix this situation, more than increasing the file upload limit. Thoughts? --Jeffmcneill (talk) 10:27, 11 April 2017 (UTC)
- agree about the desirability of text-readable books-files; but for many works in our "range" we're lucky to get anything; & for many of the "biggies" like churchill's histories, image-only is all i can find presently (unless we buy & copy the ebooks; i have no funds for that, though). will look on fileshare when i can, but right now i can't "dedicate" a machine to it (my desktop unit is down).
- agree that having ocr-capability will help greatly.
- but for larger works, or more graphical works, they just won't fit into 16mb. perhaps we could have a case-by-case process for that? or a "special project", or etc.?
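To make the backup point in this thread concrete, here is a rough sketch of how retained copies multiply storage cost. Everything in it is an assumption except the figure of 15 retained backups and the ~6.4gb size quoted in the discussion: the price per gb is a placeholder, and each backup is treated as a full copy (incremental backups would cost less).

```python
def monthly_backup_cost(data_gb, retained_copies, price_per_gb_month):
    """Storage cost per month if every retained backup is a full copy."""
    return data_gb * retained_copies * price_per_gb_month


RETAINED_COPIES = 15       # figure quoted in the discussion above
ASSUMED_PRICE = 0.02       # hypothetical USD per gb per month, NOT a quoted price

# Cost of keeping 15 full copies of ~6.4gb of files.
print(monthly_backup_cost(6.4, RETAINED_COPIES, ASSUMED_PRICE))
```

The same function shows why a single large upload matters: one extra 1gb file adds 15 x the per-gb price every month under these assumptions.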
New Filetypes Supported
- please increase the range of file types accepted for upload? .pdf .djvu .epub most critical; possibly .gz & .xml moderately larger max file sizes would be nice too; i have a growing number of churchill, milne, & mencken pdfs (optical, not ocr'd) that are too big to fit here (>__<) Lx 121 (talk) 22:38, 3 April 2017 (UTC)
- This is partially implemented, as there are some special extensions to manage the viewing experience of .djvu and others. However, the file extensions are now supported in upload:
- Here are the supported file extensions: 'djvu', 'epub', 'gif', 'jpg', 'jpeg', 'pdf', 'png', 'svg', 'tif', 'tiff'. I've also increased file sizes. But there should be consensus on what that limit is, and also the need to not have optical-only pdfs, if at all possible. --Jeffmcneill (talk) 16:38, 6 April 2017 (UTC)
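For readers who want to check a file before trying to upload it, the rules discussed on this page can be sketched as follows. This is not how the wiki enforces them (on a MediaWiki site that is normally done through configuration such as `$wgFileExtensions` and `$wgMaxUploadSize`); the function `check_upload` is hypothetical, the extension list is copied from the comment above, and the 16 mb limit is the one mentioned earlier in this discussion.

```python
import os

# Extension list as given in the comment above.
ALLOWED_EXTENSIONS = {
    "djvu", "epub", "gif", "jpg", "jpeg", "pdf", "png", "svg", "tif", "tiff",
}
# 16 mb limit mentioned in the file size discussion.
MAX_UPLOAD_BYTES = 16 * 1024 * 1024


def check_upload(filename, size_bytes):
    """Return (ok, reason) for a candidate upload. Hypothetical helper."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"file type '.{ext}' is not accepted"
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "file is larger than the 16 mb limit"
    return True, "ok"
```

For example, `check_upload("Book.DJVU", 5_000_000)` passes, while an `.ogg` file or a 30 mb pdf is rejected with a reason that can be brought to the other editors.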
Sharing cost
Hi, There is a bit of this discussion in the domain section above, but we can enlarge the idea. I can't financially support a whole site like Wikilivres now, but I can help a little bit. What do you think? Is the Paypal account email@example.com OK for that? Regards, Yann (talk) 11:05, 13 April 2017 (UTC)
- Hi @Yann: I'd like to take the money so far and reimburse @Koavf: for the wikilivres.ca renewal, as that was what the money was going to be used for (domain registration). I'll set up an email at firstname.lastname@example.org in a few days and connect that to paypal, so it looks more legitimate. Costs are not that expensive, from a hosting perspective. I found a great deal on the current host: $10/mo (USD). Though there is also DNS ($0.50 (fifty cents) monthly), and backup costs (to Amazon S3, maybe $5/mo, depending on size of backups). Another option is to get sponsors for certain specific content. --Jeffmcneill (talk) 13:54, 13 April 2017 (UTC)
- This is much cheaper than what I paid, around 50 US$/month, not including domains. Regards, Yann (talk) 18:13, 13 April 2017 (UTC)
- This is the cheapest I've paid also, especially since we have 8gb of ram and 4 processors. But the host doesn't have any extra tools like Linode does (for which we pay about 2.5x, including daily full backups of the entire system). Hosting costs keep going down, also backup storage costs, though not as fast. I figure < $20/mo not including the domain. Though depending on what kind of growth path we will be on, that will increase over time. Also, the server resources and kinds of backups are better than the previous configuration. We get more for less (if only the rest of life were this way). --Jeffmcneill (talk) 04:40, 14 April 2017 (UTC)
- comment, various -- what is the DNS fee for, exactly?
- re; backups: i think we were getting free backups from the wikiteam ppl? though not instant or daily.
- very cool that we got the name back! any hope of recovering the rest of the lost content?
- no objection to the idea of sponsors. we could even consider advertising; as long as it is carefully chosen. we do not have the same conflict-of-interest problems as wikipedia; in that we are not creating article "about" topics/subjects. we are "just" hosting content; & our main mission & purview are pretty clearly defined.
- we would need to have some kind of formal, credible "business structure" to operate on that level though; a "foundation" of our own? or to join somebody else's. back on the old site, was wondering if we could get one of the wikipedia regional chapters to take an interest in the project.
- re: hosting: as long as it's reliable, & has the capacity we need for storage, bandwidth, etc. cheap is probably more important than service "extras". we can always make or arrange for our own tools.
- re: restore, lost content. understood that was it for the wikiteam backup. was wondering if we stood any chance of getting any of the lost content from the old host (since we have recovered the domain name from them)? stuff from after the last backup was made; abt a month's worth of work. Lx 121 (talk) 10:14, 15 April 2017 (UTC)
- Hi @Lx 121: we don't have access to the actual host, only have the registrar nameservers for the domain point toward my servers. The hosting package probably expired earlier, and was then deleted (which is when Wikilivres became offline). We could get the registrar/domain name back working, but the server configuration and all the files are gone. Best to put out of your mind there is any possibility of recovery. We have what we have and are moving forward. --Jeffmcneill (talk) 04:13, 19 April 2017 (UTC)
- Hi, I do not want advertising on Wikilivres. Otherwise I would have done it long ago, and I wouldn't have to transfer the site to someone else in the first place. Seeing the hosting cost now, I don't see the point anyway. Regards, Yann (talk) 08:35, 16 April 2017 (UTC)
- I can respect that. How about sponsorship of selected pages? If Wikilivres is to grow, especially with more functionality (audio/video) then there are greater hosting costs. If we wanted to allow for larger collections of high-resolution artwork, again, more storage is more hosting costs. What I am thinking of would have to be acceptable by those who work on particular works, and not violate link-buying rules for Google. An example site-wide would be asking the hosting company to sponsor and provide a hosting provided by XYZ link at the bottom of the page. I'm not sure we want to do site-wide links, but this is an example of unintrusive sponsorship.
Deletion of old content because it is somewhere else
I'm seeing content deletion for content that has been on Wikilivres for many, many years, sometimes close to a decade. The reason provided is that the content is at another location (e.g., Wikisource or WikiCommons). I'd like more clarity from the admins about what was policy in the past regarding duplicate content. It concerns me because removing content from here doesn't mean the content will remain at other locations. More importantly, content that has had a history here is linked to from other sites, has visitors based on those links, and provides signals to the search engines. Deleting content and not updating links breaks the web. Creating external redirects is difficult to do on MediaWiki, and has many negative points to it, so healing broken links is generally not plausible. While it is good to have a focused purpose for Wikilivres, a temporary shelving area that services other wikis is probably not the main goal. I'd really like admins to help me understand the history and practice regarding content deletion in the past. --Jeffmcneill (talk) 06:23, 15 April 2017 (UTC)
- which content in particular is getting deleted? haven't been watching closely, but the only thing i've noticed was one user's approach to "housekeeping"; deleting local image files which are duplicated from commons, giving "preference" to simply linking to the files there. if more than that is going, i haven't seen/noticed it yet.
- not clear on what or whether we had "policy" on this (duplication) beyond the basic idea that we were here to host material that can't be hosted @ the wikimedia projects (due to copyright rules there). setting aside image content that meets those criteria, i think "we" just started uploading the author-image files here when/because linking with those files on commons stopped working.
- i'm flexible on what policy we should have for this.
- 2 obvious points:
- 1. the other site needs to be a stable, reliable online resource
- 2. if we keep it here, we need to consider the impact that has on our site's hosting resources.
- As I remember, some content, esp. in Russian, was moved to wikilivres.ru, because they keep non-commercial and non-derivative works too; we don't keep them any more after our policy was changed a few years ago. And some content was deleted because it was moved to wikisource, because it is PD now everywhere, not only in Canada. I haven't deleted some Polish works here (although some of them were already moved to other services) because in my opinion it is better if they can be accessed from different services. It is also safer for them, and we don't gain any new space if we delete them here, because if they are "deleted" they are only hidden from non-admins and they still take up space on the server. Electron ツ ➧☎ 10:09, 15 April 2017 (UTC)
- Ray enforced a much stricter license and content policy than I, and deleted everything which was under a NC license, and most of what was already hosted on Wikisource. Copying what is on Wikisource is useless, but I don't see the point of deleting what is already here. It won't save storage space, and it will break some external links to our content. Regards, Yann (talk) 08:39, 16 April 2017 (UTC)
- I agree there is no point in deleting stuff, with only very few special exceptions. Also, if pages/content are popular, they should not be deleted, and we won't know that until we get a better picture from the analytics I have running. Even the 2,7xx images (2gb) from that German encyclopedia, that drives me crazy, are getting visits, and I need to look closer to see if it is useful to our visitors, before considering deleting that content. Please folks, do not delete content unless it violates copyright or trademark policy. We can discuss deletions if people feel strongly content should not be here, but please discuss first. This is for both old and new content. --Jeffmcneill (talk)
Besides the question of licenses (like NC - see above) there is no reason to delete something that was published in Wikilivres according to the scope of this project. If some years later the text is not only PD 50 but even PD 70 - so what? Once it was published here, and it shows what we have done in the past. And we surely published the text before it was published on any Wikisource. It is our past, the sign of our work, and we should show all we have done. (OK, I speak here about texts, not files.) No deletions please. -jkb- (talk) 21:51, 24 April 2017 (UTC)
Category:Quality notations
The current Quality notations categories are as follows, with # of pages in each category, and the English description.
- Category:0% (144) - Works in project
- Category:20% (1) - Incomplete works
- Category:25% (289) - Incomplete works
- Category:50% (400) - Works completed but typography and layout to be corrected
- Category:70% (54) - Incomplete works
- Category:75% (5,298) - Works completed, including typography and layout
- Category:100% (2,974) - Proofread articles, on Wikilivres or by an external party
- It seems that 0% are NC works that have been blanked (contents removed but not deleted).
- 20% has 1 in it, so easy to move that to 25%
- 70% description seems to be the same as 25%, so unclear on that
- 75% seems to be an understatement, if the work is really complete
- 100% is a great idea, but proofreading is ongoing for wikis (in the case of correctable text)
So there seem to be three categories at base:
- Incomplete text
- Complete text, incomplete formatting
- Complete text, complete formatting
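As a sketch of what collapsing into three categories would mean in numbers, the page counts listed above can be regrouped as below. The mapping itself is an assumption made for illustration, not an agreed policy: 20%, 25% and 70% are treated as incomplete, 75% and 100% are merged (which drops the proofread distinction), and 0% is left unmapped because its meaning is unclear.

```python
# Page counts per category, as listed above.
CATEGORY_COUNTS = {
    "0%": 144, "20%": 1, "25%": 289, "50%": 400,
    "70%": 54, "75%": 5298, "100%": 2974,
}

# Hypothetical mapping onto the three proposed categories.
MERGE_MAP = {
    "20%": "Incomplete text",
    "25%": "Incomplete text",
    "70%": "Incomplete text",
    "50%": "Complete text, incomplete formatting",
    "75%": "Complete text, complete formatting",
    "100%": "Complete text, complete formatting",
}


def merged_counts(counts, merge_map):
    """Return (totals per new category, counts left unmapped)."""
    totals, unmapped = {}, {}
    for category, pages in counts.items():
        target = merge_map.get(category)
        if target is None:
            unmapped[category] = pages
        else:
            totals[target] = totals.get(target, 0) + pages
    return totals, unmapped


totals, unmapped = merged_counts(CATEGORY_COUNTS, MERGE_MAP)
print(totals)
print(unmapped)
```

The same mapping table is the kind of input a category-replacement bot would need, so agreeing on it first would settle most of the design.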
- I'm not against the idea of an improved quality&status rating system for content; can think of many different ways to go with that, haven't got deep enough into it to have a "best" choice. The present system is probably mostly borrowed from ws, at the time this place was started up; application of it has been pretty lax/hit & miss. We haven't really had the "critical mass" of ppl to operate such things. --Lx 121 (talk) 07:27, 15 April 2017 (UTC)
- For having 9,000 articles tagged, I would say that someone was doing a lot of work determining quality levels. If we simply work with what we have, collapse into three categories, and have some bots do category replacement, that would be an improvement. If bots could tag the top of a page in a given category with a visual indicator of the quality level of the work, then visitors (and editors) would have the information more visibly prominent. --Jeffmcneill (talk) 08:49, 15 April 2017 (UTC)
- did not know it was that many (you're right though, should have paid more attention to the numbers you posted; my apologies! had just gotten up, when i started on here). might be from "early days"; don't think anybody has been updating them in quite a while.
- I don't see a need to create a system that will be more complex, harder to use (and therefore avoided) and for what purpose? If we work with what we have, and simplify it, wouldn't that be better (better as in possible to implement quickly and be easier/simpler to use)? I'm suggesting options to fix something that seems to not work as well as it could. --Jeffmcneill (talk) 19:04, 15 April 2017 (UTC)
- Hi, I have always noted the completion status of the content I added. It is important to me that the reader is informed, especially if it was checked with a reliable source. Regards, Yann (talk) 08:44, 16 April 2017 (UTC)
Format Notations - Category ?
As mentioned above, Quality Notations does not indicate what kind of format a work is in, which could be:
- Full text or Image-based pages
- HTML (wiki), PDF, DJVU, Epub, Mobi, or some other format
- Starting with full-text markup-based formats (html/wiki/epub/mobi), all other formats can (in theory) be generated. Certainly wikitext to pdf.
- The same is not true starting with page-based formats (pdf/djvu) as even if full text, formatting results need to be completely redone manually.
- Without the overhead of full OCR plus manual proofreading, it is not possible to get from an image-based format to a text-based format.
It would be great if anyone visiting Wikilivres could easily understand what format given works are available in. Besides file formats, there is the basic: read-online vs. download. The Archive.org is fairly good about showing different download file formats.
- Are there any categories or templates in-use that indicate filetype or file quality?
- For completed works that are in wikitext, is it generally the case that each chapter is a wiki article?
- Is there some kind of format that is generally used (or set of templates)?
- comment -- i think we basically just copied the process used by wikisource/en; but never had even their level of "manpower" to implement it. am certainly open to new/different approaches; haven't put in enough time thinking about it to have an opinion on "best" here either.
- clearly convertibility & downloadability are top considerations.
- except for page size limits, not clear what advantages there are to breaking up a work 1-chapter per page.
- Large files are also an issue when it comes to indexing a page (and also formatting it for different filetypes, such as pdf, ebook, etc.). A book chapter (though the size of them varies) is a good breaking point. The epub format helps a lot in this. Each chapter gets an H1 (which is what each page on the web is supposed to have, one and only one H1). For search, Google doesn't want to index 50,000 words from a page (which is an average-sized book wordcount), but rather 2,500 words in 20 chapters, or something like that. Chapter-sized wiki pages are good for users, good for mobile, good for Google, and good for organizing book elements (for other formats). --Jeffmcneill (talk) 16:24, 18 April 2017 (UTC)
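The chapter-per-page idea above can be sketched as a small splitting step. This is an illustration under a stated assumption, namely that chapters in a single-page work are marked with level-2 wiki headings such as `== Chapter 1 ==`; works marked up differently would need a different pattern, and `split_into_chapters` is a hypothetical helper, not an existing tool on this wiki.

```python
import re

# Matches "== Title ==" on its own line, but not "=== Subheading ===".
HEADING = re.compile(r"^==(?!=)[ \t]*(.+?)[ \t]*(?<!=)==[ \t]*$", re.MULTILINE)


def split_into_chapters(wikitext):
    """Return a list of (title, body) pairs, one per level-2 heading.

    Text before the first heading (front matter) is not included.
    """
    matches = list(HEADING.finditer(wikitext))
    chapters = []
    for i, match in enumerate(matches):
        start = match.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(wikitext)
        chapters.append((match.group(1), wikitext[start:end].strip()))
    return chapters


def word_counts(chapters):
    """Return (title, number of words) for each chapter."""
    return [(title, len(body.split())) for title, body in chapters]
```

Running `word_counts` on the result gives a quick check of whether any chapter is still far above the roughly 2,500 words per page suggested above.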
Copyright and Inclusion Policy
Hi, I've made a Flowchart on how to determine if a foreign work is in the public domain in Canada. This is my current understanding regarding foreign works and Canadian copyright policy regarding the public domain. This applies to all Berne treaty and WTO/TRIPS accord signatories, which is all but a handful of nations. Please let me know what corrections to make to this image:
Inclusion Policy on Licenses
Currently the following licenses are accepted: PD, FDL, CC-BY, CC-BY-SA, ArtLibre.
- GnuFDL actually has some problems with it especially with short works. There are two clauses that can cause a lot of limitation on derivative works. Others have pointed out these limitations, and for that reason WikiCommons recommends dual-licensing if using FDL. I think it is more trouble than it is worth. There are only a few dozen items labelled this way and in some cases the use of the category is not accurate to the license provided, or there is multiple licensing. Note that GFDL is incompatible (both ways) with the GPL.
- ArtLibre is Free Art (in English) and it is not as well known (in the English-speaking world), as it is a French licence (though translated). Looks like this has been applied to 8 transclusions.
There are a few more licenses that work with what this place is (Public Domain + Free License). Note that several of these come from software so they won't necessarily be applicable (though some like GnuGPL are). The main goal is to allow for material under different licenses to be able to live on Wikilivres (and vice versa).
When uploading a file, these are the current options:
- Public domain (author's life +70 years old)
- Public domain in Canada (author's life +50 years old)
- Public domain in Canada (shorter term)
- Ineligible for copyright
- Creative Commons - Attribution 3.0
- Creative Commons - Attribution-ShareAlike 3.0
- GNU FDL
- Not free
- Not in public domain
I think this is complicated, and a bit misleading. Either something is in the public domain, or it is not, and a license needs to be declared. In addition, the terms of being in the public domain only apply within a given country. So the fact that there may be a Life+70 public domain in a country somewhere, does not apply to Canadian law. I kind of get where this was going, where people could find out where they could use the work. But this is different than licensing.
Suggested (first step) simplification
- PD - Public Domain (work already in the public domain)
- CC-0 - No Rights Reserved (copyright holder has released all rights)
- CC-BY - Attribution (creators require attribution)
- CC-BY-SA - Attribution Share-Alike (Attribution + derivative works require same license)
- GPL - GNU General Public License v3 or later + font exception (For those who want a GNU license)
Here are some relevant links to the licenses:
- Public Domain
- No rights reserved (CC0)
- Attribution (CC-BY)
- Attribution, Share Alike (CC-BY-SA)
- GPL with Font Exception
- I am not a lawyer, but I know that CC licences have some slight differences between their versions, so e.g. CC-BY-SA v.2, CC-BY-SA 3.0 and CC-BY-SA 4.0 are not the same licenses for the lawyers (there was a stir on Wikipedia when they changed the licence from GNU FDL and CC-BY-SA 2.0 to 3.0 some time ago). As for PD: in my opinion it is important for people from other countries to know whether it is PD-50, PD-60, or PD-70... or even PD-100 for people who lived in e.g. Mexico. Electron ツ ➧☎ 10:09, 16 April 2017 (UTC)
- Public domain laws by country are different from licenses, so these should not be mixed up.
- I do understand the desire to help visitors be informed, but there is no PD-50, PD-60, PD-70. There is only the public domain, and the question of in which country or countries. Under different countries' laws there are different durations. However, there are really a lot of exceptions, and a lot of the bottom labels on the documents/authors are simply not correct. We can't mislead people, and we can't explain every single country for every given work.
- That said, there are a few things we can generally say:
- * Public Domain in Canada
- * Public Domain in the source country (if not a Canadian work)
- * Date of the author's death (or whether still alive), and whether that has reached Life+50 or Life+70
- In general there are really only two main terms: Life+50 and Life+70, then there is Life+60 (India + Venezuela), those handful of states that do less than Life+50 (Djibouti, Somalia, Yemen, Libya), and those that do more than Life+70 (around 10 countries).
- Canada simplifies things because basically everything becomes Life+50 (except for the four outlaw states), though this is no longer true for photographs.
- But my main point is that we cannot make statements that "works by this author are in the public domain in countries with laws that are life + XY years" (XY being this year minus the death year). It is not correct in many instances.
- I think what would help would be a page that could be referred to regarding the law for the countries that Wikilivres editors are working most with, and a flowchart for each (on separate pages). Trying to cram that clarification into a footer doesn't work so well. --Jeffmcneill (talk) 16:19, 18 April 2017 (UTC)
- I realize this is a bigger issue and my ideas run counter to how things are done around here. I'd like to make this a project that interested Wikilivreans can discuss after I do more research. In the meantime I will take the suggestion of @Simon Peter Hughes: and look into other templates and/or creating alternatives. --Jeffmcneill (talk) 03:58, 19 April 2017 (UTC)
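To make the Life+50 / Life+70 check discussed above concrete, here is a small Python sketch. It assumes only the common rule that a Life+N term runs to the end of the calendar year N years after the author's death, and it deliberately ignores every exception raised in this thread (photographs, anonymous works, shorter-term rules, country-specific quirks). The function names are invented for the illustration; the result is a rough indicator, not a legal determination.

```python
def pd_year(death_year, term_years):
    """First calendar year in which a plain Life+N term has run out.

    Assumption: the term runs through 31 December of death_year + term_years,
    so the work is free from 1 January of the following year.
    """
    return death_year + term_years + 1


def term_status(death_year, current_year, terms=(50, 70)):
    """Report, for each Life+N term, whether it has elapsed by current_year."""
    return {
        f"Life+{n}": current_year >= pd_year(death_year, n)
        for n in terms
    }


if __name__ == "__main__":
    # Ernest Hemingway died in 1961; checked against the year of this discussion.
    print(pd_year(1961, 50))
    print(term_status(1961, 2017))
```

A footer could show this kind of date-based status together with a link to a per-country page, rather than asserting public-domain status for whole groups of countries.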
Proposal Change to Inclusion Policy
Hi folks, the discussion above became confused with what is public domain vs. the acceptable licensing. I'd like us to take on the second part first. I have a concrete proposal, as follows:
- Remove GFDL and ArtLibre licenses as acceptable (we can keep what we have)
- Add the following two licenses: CC-BY-ND and CC-BY-NC-ND
The main reasons are as follows:
- GFDL has some restrictions that could be problematic in practice (requiring some part of a document be reproduced, but the rest of it being able to be modified).
- ArtLibre is two-way compatible with CC-BY-SA, therefore redundant
- The ND and NC-ND are to allow works to be free in the sense of distribution, but not in the sense of derivatives, or derivatives + economic benefit. I do not see these as being incompatible with the Wikilivres project, since we are at core a distribution project. In this case, works that want to restrict derivatives, or derivatives and economic benefit, still offer free distribution. That should be encouraged rather than discouraged, in my thinking.
For the first five years of the project these licenses were acceptable.
Note that I do not recommend accepting CC-BY-NC-SA or CC-BY-NC. These two licences create the problem of orphan works, as argued by Stallman. He makes a good case for avoiding licenses that restrict economic freedom but do not restrict derivative works. The main point is that changes (and therefore authorship) can pile up on a work, and it is then effectively impossible to negotiate use rights that are more permissive. Works become *orphans* in the sense of having no practically identifiable rights holder, while still being restricted in use, usually for 5 or more decades. Stallman's suggestion is to use ND when using NC, which means economic rights are, in practice, a kind of derivative work right. However, in the reverse situation, use of ND does not require NC.
Please share your thoughts on this change. We would effectively be increasing what we would accept on Wikilivres. However, I do not see this change having any significant impact in terms of server resources. --Jeffmcneill (talk) 16:47, 22 April 2017 (UTC)
- comment -- in practical terms, most of our content on this project is likely to be PD, but "in principle" i would favour including/allowing the widest possible range of licenses that we can, within the parameters of our project "mission"/goals; so presumably open source is preferred, & the less user-restrictions the better. that maximises potential content; & end-users simply don't care about the administrative considerations for running the project, they just want access to content. --Lx 121 (talk) 16:10, 24 April 2017 (UTC)
More on Licensing and Compatibility
As @Electron: notes, there are different, non-compatible versions of the Creative Commons license. I think there is no way around offering all of them as options. In general, most have the following versions: 1.0, 2.0, 2.5, 3.0, 4.0. Regarding Public Domain, there are basically three categories: those works whose copyright has been waived by creators; those works whose copyright has expired; and those works which are ineligible for copyright (never were eligible for copyright in Canada). Creative Commons has two indicators for public domain, one which is the waiver, and the other which is merely a generic mark. However, jurisdiction is not included in the definition of the generic public domain mark, and it is therefore only appropriate where a work is public domain in *all* jurisdictions. See: https://creativecommons.org/publicdomain/
To that end, Biblio.wiki does need (and generally provides) greater clarification. See below for basic text, which would then link (not currently) to descriptions of the licence:
Wikilivres.ca domain will DROP - Next Steps
Hi everyone. The registrar was informed by CIRA that when there is a death, the estate is to show the registrar the death certificate, and the registrar is to then cancel the registration, which means it enters the drop list. I've asked if we can just let the domain ride for now until it expires, and that there is likely going to be no action by the estate. I will let you know of their reply when I receive it.
Here is what will happen, most likely:
- Probably we can use the domain name until it expires, as I don't see the registrar doing anything without the death certificate.
- At some point (sooner or, at the latest, within two years) the domain will enter the drop list.
- There will most likely be up to 30 days when the domain would not be responsive at all (no records functioning).
- When the domain drops, it will become available again. The best way of trying to get it is to use a dropcatching service.
- Because domainers can see upcoming drops, there will likely be competition for a domain with 700+ domains linking to it, and it becomes a bidding war.
There is no guarantee that we will be able to get wikilivres.ca when it drops, and it is only a matter of time when it does -- somewhere between today (not likely) and a little less than two years from now. The domain will be expensive to get when it does become available.
There is no other way around this that I can see, so we need to take action. I recommend we cut over to a new domain name as soon as possible, and start re-branding on it, get the interwiki links cut over, and start asking folks to rewrite their links to the new domain. We can have wikilivres.ca redirect to the new domain for as long as the registrar allows the current dns records (nameserver records) to exist (again, possibly up to 2 years).
So, we are back to figuring out what domain name to go with. We had more-or-less agreed on wikilivres.io, though it is not ideal (the .io confuses people, and that won't change anytime soon). I proposed publica.wiki, which dispenses with the old name. Another proposal was classics.wiki. Here was the reasoning:
- Wikilivres.io is a brand name that is shared with two other projects (wikilivres.ru, a spinoff of wikilivres.ca; and fr.wikibooks.org)
- The .io is confusing to people (especially as we are using Canadian copyright law)
- Another option is livres.wiki, which I think is a good use of the old name, and a generic tld (.wiki). This is probably the best for the old brand name, no confusion
- Publica.wiki would be a new brand name and essentially a proper name (it would have the most trademarkability), it has a sense of the public
- Classics.wiki is more generic, but says something about the content on the wiki
- I like Publica Wiki best. I think that it would be the name that the greatest number of speakers of different languages would find easy to pronounce. I think that the current name could be confusing to non-French speakers. Classics Wiki should also be easy to say and readily understandable to speakers of many different languages. I'm not sure that it really best summarizes our content, though. All works here are in the public domain but not all of them are widely regarded as classics. Simon Peter Hughes (talk) 05:24, 20 April 2017 (UTC)
comment -- can we get in touch with the estate/heirs? & just offer to buy it from them.
also; think we should at least "bookmark" it with a good dropcatcher.
also, also; we definitely need to set up something, to stop this from ever happening again(!)
publica isn't bad; none of the options listed are terrible. but maybe we should play with ideas some more?
for example: does anyone know esperanto well enough to come up with a relevant word list?
or latin, etc.
P.S. -- (>__<)....!/? what a pain; i thought we had just got this sorted. :(
- I thought we paid for the current name, so there is no rush. Now to the discussion. As I understand it, we have two reasons for changing the domain name: 1) to avoid confusion with other projects; 2) the old one will expire in two years anyway. If we respect both reasons, we should drop all 4 names: publica, classics, wikilivres, livres. They are already used by other projects (websites). Esperanto libroj and Spanish libros are taken too. For that path we need absolutely free, new names. I propose libros50, libroj50. We then have to estimate whether we can buy the new name and all its variations, like libroj50.wiki and libroj50.com and libroj50.site and so on. Otherwise the problem of confusion will return sooner or later. If we cannot afford it and consider only the second reason, then... I don't know, maybe just stay at wikilivres.io. The name publica is the same as a Russian word meaning audience. Classics is a bit misleading. Livres.wiki is not bad; I also like Spanish Libros and Esperanto Libroj. I am not insisting on anything, and am ready to see more names. -- VadimVMog (talk) 11:57, 20 April 2017 (UTC)
- I like Libros50. I think it would be better to go with the Spanish "libros" rather than the Esperanto "libroj" (pronounced: /leebroy/). I think that to most people who don't speak Esperanto, a word ending in a j would just look very, very strange.
- To tell the truth, the name wikilivres.ca is known by many people, so changing the name is very risky. If we have to change, it would be less risky to change it to wikilivres.something, where something can be wiki, io or something else. The next name, publica.wiki, is also not bad. And what about just pd.wiki? It is very short and easy to remember. And it is not occupied now. Electron ツ ➧☎ 22:51, 20 April 2017 (UTC)
- I agree about risk but we have no choice. The domain name was changed 5 years ago from wikilivres.info, and there are still a thousand links abandoned that never were rerouted. That's water under the bridge. In any case, pd.wiki is short and sweet, but it is therefore a premium, and costs $325 USD/year (at namecheap). --Jeffmcneill (talk) 00:06, 21 April 2017 (UTC)
- Well, for pd.wiki I've found an offer for only $267.5 (www.pananames.com), but anyway it is not a very encouraging price... pd50.wiki is offered for only $6.88/year and retail $24.88/year (www.pananames.com) and pd50.org for $12.23. But they are not as sweet and short as pd.wiki... Electron ツ ➧☎ 10:29, 21 April 2017 (UTC)
Thinking about it all day long. Thinking how to avoid damage. If we change the name early (right now), we can redirect from wikilivres.ca to a new name, right? Then for almost two years existing links will lead to a new name and ppl (at least most of them) will get used to a new name and when wikilivres.ca drops, they will migrate more easily. As for a new name, well, I like that pd50.wiki. It's short, it's neutral, it reflects the content, it's cheap. -- VadimVMog (talk) 13:48, 21 April 2017 (UTC)
- Would pdomain50.wiki be more understandable, or would it be too long? --Zephyrus (talk) 15:52, 21 April 2017 (UTC)
- For a name, it may be helpful to be more generic or to be "fanciful" which is the strongest argument for trademark. If laws change, then PD50 might not be accurate. Fanciful would need to be more than the category in which it operates (public domain/free creative works). In this case, Yann.wiki would be a stronger basis to build a brand that is recognizable. Currently, though too long I think the best characterization is:
- * Public Domain and Free creative works (text/images)
- * Under the more permissive (than in US/EU) Life+50 copyright duration
- * In library terms, this is a *special collection*, or rather, several special collections at a special (digital) library
- * In several different languages
- It is impossible to capture all of these facets in a short name. But (while we do have some competition in Gutenberg Canada) we offer early access relative to L+70 copyright duration countries, so something that captures that meaning might be interesting:
- * early.wiki
- * dawncollection.org
- Just more ideas for the stew. --Jeffmcneill (talk) 17:02, 21 April 2017 (UTC)
- pd50 is not linked to the country, so the name will still be good if Canada abandons Life+50. There are other countries with Life+50. They can't all change their laws. If they do, our project will die with any name. Hope this will not happen. This is an interesting discussion; I hope none of us has negative emotions. It's always interesting to discuss the future, to draw up plans. -- VadimVMog (talk) 18:41, 21 April 2017 (UTC)
- Next my proposals:
- * biblio.wiki - my favorite - short, easy to remember and can be understood in many languages; wide and voluminous; unused and cheap ($6.88/year, retail $24.88/year)
- * publio.wiki - as biblio.wiki
- * freebiblio.wiki, freepublio.wiki
- * freetexts.wiki, freebooks.wiki,
- * bookworm.wiki, bibliophile.wiki, reader.wiki
- Electron ツ ➧☎ 18:56, 21 April 2017 (UTC)
Let's make a list. So far we have:
Maybe it's time to choose? I'd like to have a choice -- not one name, but, say, four I like most. If everybody names 4 best, maybe we'll be able to name the best compromise. Do you agree or disagree with that? -- VadimVMog (talk) 20:56, 21 April 2017 (UTC)
- First: something.wikilivres.something is the only option I could support. Secondly: it is not, and never was, our way to settle important questions here in just a few days; this project is not visited as frequently as others. I've just returned from a short journey and I must read some threads, but it is not OK to try to make quick decisions. Let us talk about it for more than a few days. -jkb- (talk) 22:31, 21 April 2017 (UTC)
question -- what if we used (carefully chosen) "sponsorship/advertising/whatever" -money to buy wikilivres.ca (&/or possibly wikilivres.info in the future)? question 2: will the price be as high for renewals, or is this just an "auction" for the initial rights? --Lx 121 (talk) 15:57, 24 April 2017 (UTC)
- It's got the wiki,
- It replaces livres (books, in French), with works (in English, more inclusive, as in works of art/works of artistic creation, which would include non-books such as photos, poems, musical compositions, etc.),
- and it has a generic tld (.org).
I'll filter out some names. My filters are:
1) to avoid current or future name collision to have a unique name
2) it must be cheap (not much to worry about, if a name is unique, it's usually cheap too, but not always)
3) it must sound at least not bad in all european languages, even better to sound good
4) to reflect the site content
5) to be not too long
I'm not including "a name must be neutral" as it is not important for me, or else I would prefer Esperanto, which I like. Now I'll try to filter out (the number is the reason to exclude):
- publica.wiki -- 3 to Russians
- classics.wiki -- 4
- wikilivres.io -- 1
- livres.wiki -- 1
- libros50.wiki (slightly 4)
- libroj50.wiki -- 3
- pd.wiki -- 2
- pd50.wiki (sounds pidififty) (slightly 4)
- pdomain50.wiki -- 4 (the word "domain" has in English 2 meanings as in "web names domain" and "public domain")
- yann.wiki -- 4
- early.wiki -- 4
- dawncollection.org -- 4, 5
- biblio.wiki -- 1
- freetexts.wiki -- 4
- freebooks.wiki (slightly 4)
- bookworm.wiki -- 3 (and slightly 4)
- bibliophile.wiki -- 3, 5
- reader.wiki -- 4
My choice is pd50.wiki and freebiblio.wiki.
Please take no offence; these are just my preferences. Some people will filter differently. With my respect to every opinion. -- VadimVMog (talk) 04:55, 29 April 2017 (UTC) @Jeffmcneill:, @Lx 121:, @Simon Peter Hughes:, @-jkb-:, @Electron:
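The filtering pass above can be written down as a small table and one filter, which makes it easy for others to re-run it with their own scores. The sketch below, in Python, transcribes the reasons exactly as listed (with "slightly" reasons kept separate and not treated as exclusions); the variable and function names are invented for the illustration.

```python
# Reasons for exclusion, numbered as in the filter list above:
# 1 name collision, 2 price, 3 sound, 4 does not reflect content, 5 too long.
# Each entry is (hard_reasons, slight_reasons), transcribed from the list.
SCORES = {
    "publica.wiki": ([3], []),
    "classics.wiki": ([4], []),
    "wikilivres.io": ([1], []),
    "livres.wiki": ([1], []),
    "libros50.wiki": ([], [4]),
    "libroj50.wiki": ([3], []),
    "pd.wiki": ([2], []),
    "pd50.wiki": ([], [4]),
    "pdomain50.wiki": ([4], []),
    "yann.wiki": ([4], []),
    "early.wiki": ([4], []),
    "dawncollection.org": ([4, 5], []),
    "biblio.wiki": ([1], []),
    "freetexts.wiki": ([4], []),
    "freebooks.wiki": ([], [4]),
    "bookworm.wiki": ([3], [4]),
    "bibliophile.wiki": ([3, 5], []),
    "reader.wiki": ([4], []),
}


def survivors(scores):
    """Names with no hard exclusion reason; 'slight' reasons do not exclude."""
    return [name for name, (hard, _slight) in scores.items() if not hard]


if __name__ == "__main__":
    for name in survivors(SCORES):
        print(name)
```

Note that freebiblio.wiki, one of the two stated choices, was not scored in the list above and so does not appear in this table. Anyone who filters differently only needs to edit their own copy of the scores.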
- Hi, Oh, bad news... My preference goes, in that order, to publica.wiki, biblio.wiki, wikiworks.org. Regards, Yann (talk) 10:22, 30 April 2017 (UTC)
biblio.wiki seems to have a lot of support
I've looked through these comments and it appears that biblio.wiki seems to have a lot of support. There are good aspects to this name:
- Short
- Cheap (.wiki)
- Has a sense of "library" (biblioteka, etc.)
- Not a bad sound "Beeb Lee Ohh Wee Kee" / "Bihb Lee Ohh Wih Kee"
The only thing against it is the biblio.com antiquarian online bookseller. Nonetheless, from a trademark approach I think biblio.wiki is still viable, as follows:
- Biblio.com is a US online bookstore/marketplace
- The Biblio trademark is a service mark for Operating on-line marketplaces featuring used, rare and out of print books
If we always used the term Biblio.wiki / bibliowiki / BiblioWiki, etc., and never operated an online-marketplace for books, then there should not be any confusion (and certainly we are not in the same business).
For this reason I would like to move forward with this name. I'd prefer not to have a complicated voting process, but a straight up-or-down support/oppose/neutral, if that is ok? --Jeffmcneill (talk) 15:24, 1 May 2017 (UTC)
- ok, if it must be :-( ... biblio.wiki will get my support -jkb- (talk) 18:19, 1 May 2017 (UTC)
- I don't think there should be an issue: we are not a market place (everything is free here ;oP ), and we are not a bookshop, we are an online library. Regards, Yann (talk) 19:24, 1 May 2017 (UTC)
- I'm OK with this name. -- VadimVMog (talk) 20:09, 1 May 2017 (UTC)
- OK for me too. --Zephyrus (talk) 00:48, 2 May 2017 (UTC)
- i'm fine with biblio.wiki as the new name; but i still think we should consider the question of whether we could pay for 'wikilivres.ca' out of sponsorship/selective-advertising/whatever revenues.
- also: 1. we really Really REALLY need to make sure that this problem never happens again.
- & 2. if we make biblio.wiki into a "successful" name, is that going to affect what it costs us to keep it in the future?
Hi folks, I've registered biblio.wiki for 4 years (locking in the current discount for that amount of time). A few comments on some recent discussion:
- wikilivres.ca is not available for sale. It will go through the process of dropping and then likely a bidding on a drop-catch (see previous discussion). That said, there are yearly fees for domain names (and monthly fees for web hosting/backup/dns). I am covering the monthly costs, and along with donations from three Wikilivreans have paid for the new domain name.
- I agree that the same situation can easily take place (single point of failure). We don't necessarily need a formal organization to prevent this, but definitely sharing of the key account/login/email/password info with a select group. I suggest that during May we set up some minimal structure (that could grow as needed), which would include the key parts (below). Informally this would begin with giving @Yann: the account info.
- Domain registration account (login/password)
- Website hosting account (login/password)
- Operating system accounts (logins/passwords)
- In terms of the renaming/rebranding, I suggest that we look at 01 June as a cutover date. Communication should include what this means for the community (technically and naming), and how to communicate this to all visitors.
- Wikilivres.ca as a domain should work (as long as it works) for us, so while it is wise to take action and move on this now, that domain should still work (as a known and loved brand).
- A more formal organization (registered in Canada or another country) is not a bad idea, though there is paperwork and yearly expense, even without any profits, and no benefit other than that, in an event like the one we now find ourselves in, there would be a legal chain of command. (P.S., I have operating companies in the US and Thailand, including monthly accounting filings/yearly audits.) --Jeffmcneill (talk) 18:54, 7 May 2017 (UTC)
- Note, from what I can tell from Canadian govt sites, costs are as follows: Trademark $250. Incorporation $200 + $20/year. Cooperative $250 + $40. (Canadian dollars).
- How about we wait until next April, which would be the 12th anniversary of the site? That would give us a year on the new platform, and we could revisit it then? I realize several people have been on this site for many years, but I'm a newcomer, and it is still taking time to get everything working. At that point I can help with the process, and part of the cost, provided there is interest by other editors. Incorporation/Company Formation would provide us individual legal protection, a legal entity for ownership purposes, and more paperwork (with accounting, there is usually an audit requirement that we would have to hire an accounting firm to do, so also more expense). I suggest we pursue trademark only after company formation, if at all. --Jeffmcneill (talk) 03:12, 13 May 2017 (UTC)
Unique visitors to Wikilivres now 1,000+ Day
Moved to Wikilivres:Site_Analytics
subpages for discussions
Folks, we shall soon have problems following all the discussions. I think that in such a small (!) project as Wikilivres we should concentrate all important discussions on one single page. We have at the moment some five to eight active users, so we don't need a special page for this and that, like on enwiki with some ten village pump pages. I would propose to discuss the category stuff here and not on Wikilivres:Community Portal/Category Policy/En. Likewise, we do not need special subpages for English - Russian - French as proposed on Wikilivres:Community Portal/Category Policy - the discussion will get fragmented (as experience shows, even this main talk page Wikilivres:Community Portal/en exists in one language only, and this is good). Cheers -jkb- (talk) 10:17, 24 April 2017 (UTC)
- Points taken. However, let's be clear about the intent here (though certainly the naming conventions can be changed). There are few if any policies written down, and those that exist are stretched over long periods of commentary in dozens of pages throughout the site. This is fine for those who have been here for a long time, but does nothing for new visitors. What is needed are a few very clear documents that lay out the functioning policies of this site. Otherwise what happens is that people violate these unstated policies, have their work deleted, and only then are things discussed or explained. On Wikisource there is on the Scriptorium a set of core policy documents linked at the top. The idea that Wikilivres can operate with a single policy document (the inclusion policy) is simply not realistic, nor kind to others.
- I am happy to have fewer places to discuss things, and have also voiced such requests. @-jkb-: please help clean up https://wikisource.org/wiki/Wikisource_talk:Wikilivres. There are several items with requests for feedback and comment that have had little to no response on this page, probably because one long page is very difficult to parse when it comes to keeping up and responding. Having discussions on wiki talk pages is difficult, as the technology does not fit the user needs. I understand that perhaps folks have nothing to say, one way or the other. That's ok too.
- We do need some basic policies (which will be their own pages), and we can discuss these together, to whatever extent people want to be involved. Discussion can be on this page or those policy pages.
- There are two important issues currently, which are:
- * An improved copyright policy to be more inclusive, specifically with regards to CC-BY-NC and CC-BY-ND-NC (please provide feedback)
- * Some kind of consensus on a new domain name. I accept that it may not be possible to reach consensus.
- These two are significant, important, necessary changes. The first is to make a stronger claim and differentiate from the larger Wikisource and Gutenberg Canada (and get back to the roots of Wikilivres); and the second simply because there is a lot of work ahead to shift off of the current domain name, which we don't and can't control.
- It is important, at the very least, to document current practice regarding policy (such as categories, for example). Everyone is busy, believe me I get it. Documenting policy is meant to save time in the near and medium-term. If a few new pages of policy crop up, feel free to discuss or ignore, or discuss in the future. They are needed. The opportunity for discussion will be announced on this page in any case. --Jeffmcneill (talk) 19:10, 24 April 2017 (UTC)
support creating clear policy docs; but i will have no time to join the discussions until next weekend. have just finished uploading all the pd laura ingalls wilder 'little house' books, & that is it for my week here; i probably won't even have time to add more buck rogers comics until saturday. Lx 121 (talk) 19:25, 24 April 2017 (UTC)
- One point: there is one and only one Scriptorium on Wikisource for all community discussions. At the top, documents are linked, not talk pages - that's all right. But I think it is very dangerous to split the community discussion into several subpages (according to theme) and then, moreover, into different language talk pages - and do not forget: Wikisource is bigger than Wikilivres and has many more users than we have. I have worked in such projects since 2004, Oldwikisource 2005.
- To the page Wikilivres:Community Portal/Category Policy/En etc.: if and when we have a new policy for categories (after discussing it here) we can create the page anyway, but in this case I'm not sure if we need something like that. Categories have their functionality on every WMF project (where we are coming from) and it is hard for me to imagine that we should have something special they do not have. More important is to create content in the right way (hm and sorry, the pages from today with some 300,000 bytes and more, not formatted, they are not good...).
- Regards -jkb- (talk) 21:57, 24 April 2017 (UTC)
- comment -- & that is what we have "quality ratings" for. also; this is how collaborative-projects, like a wiki operate. S.P.H. & I are co-operating to add books. between us, we now have the complete works of ian fleming, & all available (all PD) works of laura ingalls wilder. 2 very important & popular writers in english. & we are the first & so far the only website that has their complete PD collections. Lx 121 (talk) 16:07, 26 April 2017 (UTC)
spam accounts -- RESOLVED
Spam accounts seem to have found us; besides the new antispam tools, we will sooner or later need the CheckUser extension, which is quite useful here. @Jeffmcneill:, do you think we can have it some time? Cheers, -jkb- (talk) 19:47, 24 April 2017 (UTC)
- addendum -- we also need to draft a proper "block notice" to post @ blocked user accounts. Lx 121 (talk) 19:49, 24 April 2017 (UTC)
- Some of the old blocked users you can see here ->  or in Category:Blocked users. But most of these accounts are not registered now... Maybe we should grant an amnesty and delete the block notices, because those accounts are not registered now and are not actually blocked. Electron ツ ➧☎ 22:41, 24 April 2017 (UTC)
.DJVU format drawbacks and alternatives
The .DJVU format is quite good for what it is intended, namely an image- and page-based archival format. However, there are significant problems when it is used on a wiki, as follows:
- When used to do manual optical character recognition (by a human typing) it is reasonable (with the proofread extension)
- However it is generally large
- The format is akin to ZIP which is read by MediaWiki, which then produces thumbnails for each page image, which dramatically inflates storage space
- While there are many applications that support .DJVU, it is not a popular format in the ebook marketplace, and requires software installation to use (none of the major ebook vendors have this as a standard file format: Amazon, Apple, Google, Kobo, Nook, Gutenberg; Archive.org has discontinued .DJVU file generation; none of the major hardware ebook readers support this format).
- The main problem is that it is not an ebook text file format, but rather a page-based, image-based file format (akin to pdf).
- Conversion from DJVU to epub or mobi is not possible, or at least not generally successful (akin to PDF to epub/mobi).
As an example, Ernest Hemingway's Across the River and Into the Trees is 11.8 MB in .djvu, plus an additional 185 MB in thumbnails generated by MediaWiki. At the same time, the .epub, .mobi, and .pdf files for this same work are only 2.4 MB combined.
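To put the figures from this example side by side, here is a small sketch of the arithmetic. It uses only the sizes quoted above; the variable names are made up for illustration.

```python
# Sizes in megabytes, taken from the Hemingway example above.
djvu_mb = 11.8          # the .djvu file itself
thumbnails_mb = 185.0   # per-page thumbnails generated by MediaWiki
ebooks_mb = 2.4         # .epub + .mobi + .pdf combined

# Total storage attributable to the .djvu workflow.
djvu_total_mb = djvu_mb + thumbnails_mb

# How many times larger that is than the three ebook files together.
ratio = djvu_total_mb / ebooks_mb

print(f"djvu + thumbnails: {djvu_total_mb:.1f} MB")
print(f"ebook files:       {ebooks_mb:.1f} MB")
print(f"ratio:             about {ratio:.0f}x")
```

On these numbers the thumbnails, not the .djvu file itself, account for most of the storage.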
Epub files are essentially html files (with css formatting and images) and can fairly easily be converted into wikitext (using tools such as Pandoc).
I understand the .djvu file editing workflow is one that has been around for a while. However, if epub files are available for a given work, those djvu files should be removed and replaced, for a dramatically improved user experience, for example Across_the_River_and_Into_the_Trees#Download. --Jeffmcneill (talk) 16:52, 2 May 2017 (UTC)
- comment -- tried to get pandoc working on my ubuntu system; nada, thus far. Lx 121 (talk) 07:47, 6 May 2017 (UTC)
- Jeffmcneill: DJVU files can't be replaced by epub files. These are different formats for different purposes. DJVU files usually contain images, while epub files usually contain text. DJVU compression can be defined on case by case, so DJVU can be very big for high quality (300dpi or higher) and high resolution, while it can be much smaller than the equivalent PDF files. Regards, Yann (talk) 18:48, 11 May 2017 (UTC)
- @Yann: yes, I agree that these formats are for different purposes, and it is true that DJVU is a format which is much smaller than PDF for high quality page images. It is also true that DJVU cannot be converted into EPUB (page vs. image-based formats, as mentioned above). However, if there is an Epub file for a given work, and any images in that ebook were high quality, but all text was text (and not an image of text), then that would be a desirable format to have, in terms of usability and in terms of server resources. I'm open to debate on this, but I just don't see any reason to keep a DJVU when Epub and PDF are available, given that the DJVU has image files of every page (and not searchable text) and the Epub and PDF files are actually mostly text, as in the Hemingway example mentioned above (Across_the_River_and_Into_the_Trees#Download). --Jeffmcneill (talk) 04:34, 12 May 2017 (UTC)
- @Yann: yes, I get that. But the DJVU files seem to never be removed. The Proofreading structure is kept in place. The Hemingway I mentioned was created in 2012 and split into Page: namespace by a bot. Maybe I just have a problem with the way the proofreading structure works. If one is reading this page: https://wikilivres.ca/wiki/Across_the_River_and_Into_the_Trees/Chapter_I and they try and click the Edit link, they get this:
<div class='text'><pages index="Hemingway - Across the River and Into the Trees.djvu" from=12 to=18 header=1 /></div>
- @Jeffmcneill: This is how it works on Wikisource. The text is transcluded from the Page namespace to the main namespace. I had a discussion with Thomas (Tpt), who maintains the Proofreadpage extension, about this issue. He suggests extending the VisualEditor to provide an easier edit interface for this. Regards, Yann (talk) 14:36, 17 May 2017 (UTC)
- @Lx 121: there are packages for Ubuntu and most Linux as well. May need to install LaTeX or even full TeX as well, for full PDF support. Even without that, though can test things like markdown to html, or wikitext to epub. I use CentOS so `yum search pandoc` shows several distributions. See also http://pandoc.org/installing.html which mentions an updated Debian distribution on their download page. --Jeffmcneill (talk) 04:40, 12 May 2017 (UTC)
- @Jeffmcneill: -- have done all of that, have installed everything i need to install (afaik), STILL doesn't work. im on ubuntu 16.04 lts, if you can tell me what i need to do or add or etc. or what thing i need to do to "summon" the program to appear, once it is installed, please help? xD because i've tried everything & GOT NOTHING for it >__<!?
- being able to convert into mediawiki format would be lovely, & it would solve most of my problems. mainly working with epubs atm. anything you can suggest?
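For the epub-to-wikitext conversion discussed in this thread, here is a minimal sketch, assuming pandoc is installed and reachable on the PATH. The function names and the file names `book.epub` / `book.wiki` are hypothetical. The pandoc options used (`--from epub`, `--to mediawiki`, `--output`) are standard ones, but the result will usually still need manual cleanup.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(epub_path, wiki_path):
    """Return the pandoc command line for an epub -> MediaWiki wikitext conversion."""
    return [
        "pandoc",
        "--from", "epub",        # input format
        "--to", "mediawiki",     # output format (wikitext)
        "--output", str(wiki_path),
        str(epub_path),
    ]


def convert_epub_to_wikitext(epub_path, wiki_path):
    """Run pandoc; fail with a clear message if the program cannot be found."""
    if shutil.which("pandoc") is None:
        # Same situation as typing `pandoc --version` in a terminal and
        # getting "command not found": the package is missing or not on PATH.
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(build_pandoc_command(epub_path, wiki_path), check=True)
    return Path(wiki_path)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_pandoc_command("book.epub", "book.wiki")))
```

Pandoc is a command-line program with no graphical window, so it is run from a terminal; the printed command can be pasted there directly.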
comment -- on the subject of file formats, we should try to offer end-users the best variety we can. priorities come down to 2 or 3 things:
1. the qualities/merits of a particular file format -type; which is "best", & best @ what/how?
2. end-user demand; what file types real-world endusers need/want/use.
3. ease-of-use & compatibility with mediawiki; needs to work on our site, & must be easy & obvious for endusers to access(!).
Pauly-Wissowa RE
First of all, thank you very much for revitalizing wikilivres.
I understand there are problems/questions around the Project RE, and have created a short description as Category talk:Paulys Realencyclopädie der classischen Altertumswissenschaft (for further info please read User talk:K67y). Having the page images that are referenced from German Wikisource available here is essential for that digitization project. Therefore, if resources are the main problem, any compromise offer would be an invaluable help, e.g. limiting the number of images, excluding them from the backup process etc. K67y (talk) 21:04, 5 May 2017 (UTC)
- This project has largely stalled, with a single user uploading many thousands of images years ago, and a few pages of text editing. This is not the appropriate place for that sort of project. Amazon S3 (with Canada locations) might be a better match, or other Canadian-located wiki projects. Trying to host gigabytes of page images on this site is definitely not recommended, and hardly justifiable. Between search engines trying to index these pages (both the images themselves and the image pages), and thumbnail generation of each and every page, our server became overloaded. This is just not a sustainable kind of cache of page images. --Jeffmcneill (talk) 04:24, 12 May 2017 (UTC)
Just to avoid the possibility of misunderstanding: does the above position mean that the allowed number of RE-scans is zero and the allowed disk space is zero, and that this position of the person who pays for the site is not open to argumentation (although your formal concerns - if intended - could be addressed by resource usage limits, by disabling thumbnail generation, by excluding robots from these files/pages and so on)? K67y (talk) 00:11, 15 May 2017 (UTC)
Dead Man's Switch
Hi folks, a simple mechanism to deal with the incapacity or demise of yours truly is a service that will send email in the event of my non-response (to emails). This service looks useful: https://www.deadmansswitch.net/ Since it sends email after 60 days, I will need to ensure that I pay ahead on the various services I use 90 days in advance. Also, I'll need to create separate accounts for these, which are mainly: Hosting, DNS, Domain Name. Thoughts? --Jeffmcneill (talk) 08:01, 12 May 2017 (UTC)
- comment -- well i hope you don't die on us! :/ but it's not only a matter of the technical aspects (being able to access the sysadmin controls, etc.). there are the legal "ownership" issues. in particular, who "owns" the rights to the name "biblio.wiki" (as regards domain name registration, & longer-term as "intellectual property" (ugh!)). if i understand our present situation at all, the key problem we have RIGHT NOW is that eclectiology was the registered owner of the registered site name 'wikilivres.ca', & that we have no way of transferring that "ownership" to anybody else, after his death. we need to fix that problem, as well as the technical issues of sysadmin/operations. --Lx 121 (talk) 10:10, 12 May 2017 (UTC)
- Wikilivres.ca is, for all intents and purposes, not ours and will not be. It is only temporarily pointed at our nameservers, hence it currently works. However, there is nothing more we can do, and it is not worth anyone's time to discuss it further. Hence the new domain name decision.
- Of the three main pieces -- domain name, hosting, and dns -- the domain name is the lynchpin, as everything else can be recreated/rebuilt. Though ideally you would not want to go through the process of rebuilding, believe me! So the hosting is also very important for practical reasons (configuration details, how things work). --Jeffmcneill (talk) 02:53, 13 May 2017 (UTC)
- There is a free option there. Maybe we can go this way?
Four recipients seems like enough? -- VadimVMog (talk) 18:30, 13 May 2017 (UTC)
Expense Report for last 60 days
I've worked up a brief spreadsheet for the expense report. There are not many items, basically hosting, dns, and domain name registration. There were donations from four members (including myself) during this time. Three members donated between $7.00 and $9.93 USD (they were asked to for the domain name registration fund). That totalled $24.52 USD. I have donated $39.57 USD to make up the difference, which was ~62% of the total expenses for the period, which were $64.09.
Some costs are monthly, and some are yearly or more. The domain biblio.wiki was registered for 4 years, to lock in the low current price on .wiki domains. Ongoing monthly costs should be between $10-15/month. I've switched over to a different DNS provider, partially to make it easier in case of a problem, but that will also save a few dollars per year. I've not included S3 storage/backup costs yet, as those are mixed in with some other domains, and I have to separate it out into a different S3 account.
All members are asked to donate what they can to help distribute the financial burden more evenly. Payments can now be made via paypal to email@example.com and all help is appreciated. I think it would be healthy if no one person paid more than half of the costs (which I am willing to do). This means that one additional donation of $7.52 and one more of $7.50 for the upcoming month of June, would be a nice gesture. Any amount is fine, of course, there is nothing magical about $7. Any regular visitors who have not yet contributed are encouraged to.
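As a cross-check of the figures quoted in this report, here is a short sketch of the arithmetic. It uses only the amounts stated above; the variable names are made up for illustration, and `Decimal` is used so the cents add up exactly.

```python
from decimal import Decimal

# Amounts in USD, as stated in the report above.
member_donations = Decimal("24.52")   # three members combined
admin_donation = Decimal("39.57")     # contribution covering the difference

# Total expenses for the period.
total_expenses = member_donations + admin_donation

# Share of the total covered by the largest single contributor, in percent.
admin_share_pct = admin_donation / total_expenses * 100

# If no one person should pay more than half, this is the amount that
# would need to come from other donors instead.
half_of_total = total_expenses / 2
amount_over_half = admin_donation - half_of_total

print(f"total expenses:  ${total_expenses}")
print(f"largest share:   about {round(admin_share_pct)}%")
print(f"half of total:   ${half_of_total}")
print(f"over the half:   ${amount_over_half}")
```

The amount over the half-way mark comes out at $7.525, which matches the additional donation of $7.52 suggested above.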
The file is located on Google Docs. Below is a snapshot.