• Àâòîðèçàöèÿ


David Humphrey: irc.mozilla.org rss_planet_mozilla 27-04-2019 05:51


Today I read Mike Hoye's blog post about Mozilla's IRC server coming to an end.  He writes:

Mozilla has relied on IRC as our main synchronous communications tool since the beginning...While we still use it heavily, IRC is an ongoing source of abuse and  harassment for many of our colleagues and getting connected to this now-obscure forum is an unnecessary technical barrier for anyone finding their way to Mozilla via the web.  

And, while "Mozilla intends to deprecate IRC," he goes on to say:

we definitely still need a globally-available, synchronous and text-first communication tool.

While I made dinner tonight, I thought back over my long history using Mozilla's IRC system, and tried to understand its place in my personal development within Mozilla and open source.

/invite

I remember the very first time I used IRC.  It was 2004, and earlier in the week I had met with Mike Shaver at Seneca, probably for the first time, and he'd ended our meeting with a phrase I'd never heard before, but I nodded knowingly nevertheless: "Ping me in #developers."

Ping me.  What on earth did that mean!? Little did I know that this phrase would come to signify so much about the next decade of my life.  After some research and initial trial and error, 'dave' joined irc.mozilla.org and found his way to the unlisted #developers channel.  And there was 'shaver', along with 300 or so other #developers.

The immediacy of it was unlike anything I'd used before (or since).  To join irc was to be transported somewhere else.  You weren't anywhere, or rather, you were simultaneously everywhere.  For many of these years I was connecting to irc from an old farm house in the middle of rural Ontario over a satellite internet connection.  But when I got online, there in the channels with me were people from New Zealand, the US, Sweden, and everywhere in between.

Possibly you've been on video calls with people from around the world, and felt something similar.  However, what was different from a video call, or teleconference, or any other medium I've used since, is that the time together didn't need to end.  You weren't meeting as such, and there wasn't a timebox or shared goal around your presence there.  Instead, you were working amongst one another, co-existing, listening, and most importantly for me, learning.

/join

Over the next year, irc went from being something I used here and there to something I used all the time.  I became 'humph' (one day Brendan confused me for Dave Herman, and shaver started calling me 'humph' to clarify) and have remained so ever since.  There are lots of people who have only ever called me 'humph' even to my face, which is hilarious and odd, but also very special.

Mike Beltzner taught me how to overcome one of the more difficult aspects of IRC: maintaining context after you log off.  Using screen and irssi I was able to start, leave, and then pick up conversations at a later time.  It's something you take for granted on Slack, but was critical to me being able to leverage IRC as a source of knowledge: if I asked a question, it might be hours before the person who could answer it would wake up and join irc from another part of the planet.

I became more engaged with different areas of the project.  IRC is siloed.  A given server is partitioned into many different channels, and each has its own sub-culture, appropriate topics, and community.  However, people typically participate in many channels.  As you get to know someone in one channel, you'll often hear more about the work happening in another.  Slowly I got invited into other channels and met more and more people across the Mozilla ecosystem.

Doing so took me places I hadn't anticipated.  For example, at some point I started chatting with people in #thunderbird, which led to me becoming an active contributor--I remember 'dascher' just started assigning me bugs to fix!  Another time I discovered the #static channel and a guy named 'taras' who was building crazy static analysis tools with gcc.  Without irc I can confidently say that I would have never started DXR, or worked on web audio, WebGL, all kinds of Firefox patches, or many of the other things I did.  I needed to be part of a community of peers and mentors for this work to be possible.

At a certain point I went from joining other channels to creating my own.  I started to build many communities within Mozilla to support new developers.  It was incredible to watch them fill up with a mix of experienced Mozilla contributors and people completely new to the project.  Over the years it helped to shape my

×èòàòü äàëåå...
êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Robert O'Callahan: Update To rr Master To Debug Firefox Trunk rss_planet_mozilla 27-04-2019 03:42


A few days ago Firefox started using LMDB (via rkv) to store some startup info. LMDB relies on file descriptor I/O being coherent with memory-maps in a way that rr didn't support, so people have had trouble debugging Firefox in rr, and Pernosco's CI test failure reproducer also broke. We have checked in a fix to rr master and are in the process of updating the Pernosco pipeline.

The issue is that LMDB opens a file, maps it into memory MAP_SHARED, and then opens the file again and writes to it through the new file descriptor, and requires that the written data be immediately reflected in the shared memory mapping. (This behavior is not guaranteed by POSIX but is guaranteed by Linux.) rr needs to observe these writes and record the necessary memory changes, otherwise they won't happen during replay (because writes to files don't happen during replay) and replay will fail. rr already handled the case when the application write to the file descriptor (technically, the file description) that was used to map the file — Chromium has needed this for a while. The LMDB case is harder to handle. To fix LMDB, whenever the application opens a file for writing, we have to check to see if any shared mapping of that file exists and if so, mark that file description so writes to it have their shared-memory effects recorded. Unfortunately this adds overhead to writable file opens, but hopefully it doesn't matter much since in many workloads most file opens are read-only. (If it turns out to be a problem there are ways we can optimize further.) While fixing this, we also added support for the case where the application opens a file (possibly multiple times with different file descriptions) and then creates a shared mapping of one of them. To handle that, when creating a shared mapping we have to scan all open files to see if any of them refer to the mapped file, and if so, mark them so the effects of their writes are recorded.

Update Actually, at least this commit is required.

http://robert.ocallahan.org/2019/04/update-to-rr-master-to-debug-firefox.html

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè

Firefox Nightly: These Weeks in Firefox: Issue 58 rss_planet_mozilla 27-04-2019 00:33


https://blog.nightly.mozilla.org/2019/04/26/these-weeks-in-firefox-issue-58/

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Chris H-C: Firefox Origin Telemetry: Putting Prio in Practice rss_planet_mozilla 26-04-2019 22:32


Prio is neat. It allows us to learn counts of things that happen across the Firefox population without ever being able to learn which Firefox sent us which pieces of information.

For example, Content Blocking will soon be using this to count how often different trackers are blocked and exempted from blocking so we can more quickly roll our Enhanced Tracking Protection to our users to protect them from companies who want to track their activities across the Web.

To get from “Prio is neat” to “Content Blocking is using it” required a lot of effort and the design and implementation of a system I called Firefox Origin Telemetry.

Prio on its own has some very rough edges. It can only operate on a list of at most 2046 yes or no questions (a bit vector). It needs to know cryptographic keys from the servers that will be doing the sums and decryption. It needs to know what a “Batch ID” is. And it needs something to reliably and reasonably-frequently send the data once it has been encoded.

So how can we turn “tracker fb.com was blocked” into a bit in a bit vector into an encoded prio buffer into a network payload…

Firefox Origin Telemetry has two lists: a list of “origins” and a list of “metrics”. The list of origins is a list of where things happen. Did you block fb.com or google.com? Each of those trackers are “origins”. The list of metrics is a list of what happened. Did you block fb.com or did you have to exempt it from blocking because otherwise the site broke? Both “blocked” and “exempt” are “metrics”.

In this way Content Blocking can, whenever fb.com is blocked, call

Telemetry::RecordOrigin(OriginMetricID::ContentBlocking_Blocked, "fb.com");

And Firefox Origin Telemetry will take it from there.

Step 0 is in-memory storage. Firefox Origin Telemetry stores tables mapping from encoding id (ContentBlocking_Blocked) to tables of origins mapped to counts (“fb.com”: 1). If there’s any data in Firefox Origin Telemetry, you can view it in about:telemetry and it might look something like this:

originTelemetryAbout

Step 1 is App Encoding: turning “ContentBlocking_Blocked: {“fb.com”: 1}” into “bit twelve on shard 2 should be set to 1 for encoding ‘content-blocking-blocked’ ”

The full list of origins is too long to talk to Prio. So Firefox Origin Telemetry splits the list into 2046-element “shards”. The order of the origins list and the split locations for the shards must be stable and known ahead of time. When we change it in the future (either because Prio can start accepting larger or smaller buffers, or when the list of origins changes) we will have to change the name of the encoding from ‘content-blocking-blocked’ to maybe ‘content-blocking-blocked-v2’.

Step 2 is Prio Encoding: Firefox Origin Telemetry generates batch IDs of the encoding name suffixed with the shard number: for our example the batch ID is “content-blocking-blocked-1”. The server keys are communicated by Firefox Preferences (you can see them in about:config). With those pieces and the bit vector shards themselves, Prio has everything it needs to generate opaque binary blobs about 50 kilobytes in size.

Yeah, 2kb of data in a 50kb package. Not a small increase.

Step 3 is Base64 Encoding where we turn those 50kb binary blobs into 67kb strings of the letters a-z and A-Z, the numbers 0-9, and the symbols “+” or “/”. This is so we can send it in a normal Telemetry ping.

Step 4 is the “prio” ping. Once a day or when Firefox shuts down we need to send a ping containing these pairs of batch ids and base64-encoded strings plus a minimum amount of environmental data (Firefox version, current date, etc.), if there’s data to be sent. In the event that sending fails, we need to retry (TelemetrySend). After sending the ping should be available to be inspected for a period of time (TelemetryArchive).

…basically, this is where Telemetry does what Telemetry does best.

And then the ping becomes the problem of the servers who need to count and verify and sum and decode and… stuff. I dunno, I’m a Firefox Telemetry Engineer, not a Data Engineer. :amiyaguchi’s doing that part, not me : )

I’ve smoothed over some details here, but I hope I’ve given you an idea of what value Firefox Origin

×èòàòü äàëåå...
êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Mike Hoye: Synchronous Text rss_planet_mozilla 26-04-2019 20:44


Envoy.

Let’s lead with the punchline: the question of what comes after IRC, for Mozilla, is now on my desk.

I wasn’t in the room when IRC.mozilla.org was stood up, but from what I’ve heard IRC wasn’t “chosen” so much as it was the obvious default, the only tool available in the late ’90s. Suffice to say that as a globally distributed organization, Mozilla has relied on IRC as our main synchronous communications tool since the beginning. For much of that time it’s served us well, if for some less-than-ideal values of “us” and “well”.

Like a lot of the early internet IRC is a quasi-standard protocol built with far more of the optimism of the time than the paranoia the infosec community now refers to as “common sense”, born before we learned how much easier it is to automate bad acts than it is to foster healthy communities. Like all unauthenticated systems on the modern net it’s aging badly and showing no signs of getting better.

While we still use it heavily, IRC is an ongoing source of abuse and harassment for many of our colleagues and getting connected to this now-obscure forum is an unnecessary technical barrier for anyone finding their way to Mozilla via the web. Available interfaces really haven’t kept up with modern expectations, spambots and harassment are endemic to the platform, and in light of that it’s no coincidence that people trying to get in touch with us from inside schools, colleges or corporate networks are finding that often as not IRC traffic isn’t allowed past institutional firewalls at all.

All of that adds up to a set of real hazards and unnecessary barriers to participation in the Mozilla project; we definitely still need a globally-available, synchronous and text-first communication tool; our commitment to working in the open as an organization hasn’t changed. But we’re setting a higher bar for ourselves and our communities now and IRC can’t meet that bar. We’ve come to the conclusion that for all IRC’s utility, it’s irresponsible of us to ask our people – employees, volunteers, partners or anyone else – to work in an environment that we can’t make sure is healthy, safe and productive.

In short, it’s no longer practical or responsible for us to keep that forum alive.

In the next small number of months, Mozilla intends to deprecate IRC as our primary synchronous-text communications platform, stand up a replacement and decommission irc.mozilla.org soon afterwards. I’m charged with leading that process on behalf of the organization.

Very soon, I’ll be setting up the evaluation process for a couple of candidate replacement stacks. We’re lucky; we’re spoiled for good options these days. I’ll talk a bit more about them in a future post, but the broad strokes of our requirements are pretty straightforward:

  • We are not rolling our own. Whether we host it ourselves or pay for a service, we’re getting something off the shelf that best meets our needs.
  • It needs to be accessible to the greater Mozilla community.
  • We are evaluating products, not protocols.
  • We aren’t picking an outlier; whatever stack we choose needs to be a modern, proven service that seems to have a solid provenance and a good life ahead of it. We’re not moving from one idiosyncratic outlier stack to another idiosyncratic outlier stack.
  • While we’re investigating options for semi-anonymous or pseudonymous connections, we will require authentication, because:
  • The Mozilla Community Participation Guidelines will apply, and they’ll be enforced.

I found this at the top of a draft FAQ I’d started putting together a while back. It might not be what you’d call “complete”, but maybe it is:

Q: Why are we moving away from IRC? IRC is fine!
A: IRC is not fine.

Q: Seriously? You’re kidding, right?
A: I’m dead serious.

I don’t do blog comments anymore – unfortunately, for a lot of the same reasons I’m dealing with this – but if you’ve got questions, you can email me.

Or, if you like, you can find me on IRC.

http://exple.tive.org/blarg/2019/04/26/synchronous-text/

êîììåíòàðèè: 1 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Christopher Arnold: An Author-Optimized Social Network Approach rss_planet_mozilla 26-04-2019 06:26


https://ncubeeight.blogspot.com/2019/04/an-author-optimized-social-network.html

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
The Mozilla Blog: Firefox and Emerging Markets Leadership rss_planet_mozilla 25-04-2019 23:05


Building on the success of Firefox Quantum, we have a renewed focus on better enabling people to take control of their internet-connected lives as their trusted personal agent — through continued evolution of the browser and web platform — and with new products and services that provide enhanced security, privacy and user agency across connected life.

To accelerate this work, we’re announcing some changes to our senior leadership team:

Dave Camp has been appointed SVP Firefox. In this new role, Dave will be responsible for overall Firefox product and web platform development.

As a long time Mozillian, Dave joined Mozilla in 2006 to work on Gecko, building networking and security features and was a contributor to the release of Firefox 3. After a short stint at a startup he rejoined Mozilla in 2011 as part of the Firefox Developer Tools team. Dave has since served in a variety of senior leadership roles within the Firefox product organization, most recently leading the Firefox engineering team through the launch of Firefox Quantum.

Under Dave’s leadership the new Firefox organization will pull together all product management, engineering, technology and operations in support of our Firefox products, services and web platform. As part of this change, we are also announcing the promotion of Marissa (Reese) Wood to VP Firefox Product Management, and Joe Hildebrand to VP Firefox Engineering. Both Joe and Reese have been key drivers of the continued development of our core browser across platforms, and the expansion of the Firefox portfolio of products and services globally.

In addition, we are increasing our investment and focus in emerging markets, building on the early success of products like Firefox Lite which we launched in India earlier this year, we are also formally establishing an emerging markets team based in Taipei:

Stan Leong appointed as VP and General Manager, Emerging Markets. In this new role, Stan will be responsible for our product development and go-to-market strategy for the region. Stan joins us from DCX Technology where he was Global Head of Emerging Product Engineering. He has a great combination of start-up and large company experience having spent years at Hewlett Packard, and he has worked extensively in the Asian markets.

As part of this, Mark Mayo, who has served as our Chief Product Officer (CPO), will move into a new role focused on strategic product development initiatives with an initial emphasis on accelerating our emerging markets strategy. We will be conducting an executive search for a CPO to lead the ongoing development and evolution of our global product portfolio.

I’m confident that with these changes, we are well positioned to continue the evolution of the browser and web platform and introduce new products and services that provide enhanced security, privacy and user agency across connected life.

The post Firefox and Emerging Markets Leadership appeared first on The Mozilla Blog.

https://blog.mozilla.org/blog/2019/04/25/firefox-and-emerging-markets-leadership/

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Nathan Froyd: an unexpected benefit of standardizing on clang-cl rss_planet_mozilla 25-04-2019 19:45


I wrote several months ago about our impending decision to switch to clang-cl on Windows.  In the intervening months, we did that, and we also dropped MSVC as a supported compiler.  (We still build on Linux with GCC, and will probably continue to do that for some time.)  One (extremely welcome) consequence of the switch to clang-cl has only become clear to me in the past couple of weeks: using assembly language across platforms is no longer painful.

First, a little bit of background: GCC (and Clang) support a feature called inline assembly, which enables you to write little snippets of assembly code directly in your C/C++ program.  The syntax is baroque, it’s incredibly easy to shoot yourself in the foot with it, and it’s incredibly useful for a variety of low-level things.  MSVC supports inline assembly as well, but only on x86, and with a completely different syntax than GCC.

OK, so maybe you want to put your code in a separate assembly file instead.  The complementary assembler for GCC (courtesy of binutils) is called gas, with its own specific syntax for various low-level details.  If you give gcc an assembly file, it knows to pass it directly to gas, and will even run the C preprocessor on the assembly before invoking gas if you request that.  So you only ever need to invoke gcc to compile everything, and the right thing will just happen. MSVC, by contrast, requires you to invoke a separate, differently-named assembler for each architecture, with different assembly language syntaxes (e.g. directives for the x86-64 assembler are quite different than the arm64 assembler), and preprocessing files beforehand requires you to jump through hoops.  (To be fair, a number of these details are handled for you if you’re building from inside Visual Studio; the differences are only annoying to handle in cross-platform build systems.)

In short, dealing with assembler in a world where you have to support MSVC is somewhat painful.  You have to copy-and-paste code, or maybe you write Perl scripts to translate from the gas syntax to whatever flavor of syntax the Microsoft assembler you’re using is.  Your build system needs to handle Windows and non-Windows differently for assembly files, and may even need to handle different architectures for Windows differently.  Things like our ICU data generation have been made somewhat more complex than necessary to support Windows platforms.

Enter clang-cl.  Since clang-cl is just clang under the hood, it handles being passed assembly files on the command line in the same way and will even preprocess them for you.  Additionally, clang-cl contains a gas-compatible assembly syntax parser, so assembly files that you pass on the command line are parsed by clang-cl and therefore you can now write a single assembly syntax that works on Unix-y and Windows platforms.  (You do, of course, have to handle differing platform calling conventions and the like, but that’s simplified by having a preprocessor available all the time.)  Finally, clang-cl supports GCC-style inline assembly, so you don’t even have to drop into separate assembly files if you don’t want to.

In short, clang-cl solves every problem that made assembly usage painful on Windows. Might we have a future world where open source projects that have to deal with any amount of assembly standardize on clang-cl for their Windows support, and declare MSVC unsupported?

https://blog.mozilla.org/nfroyd/2019/04/25/an-unexpected-benefit-of-standardizing-on-clang-cl/

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Mike Conley: Firefox Front-End Performance Update #17 rss_planet_mozilla 25-04-2019 00:52


https://mikeconley.ca/blog/2019/04/24/firefox-front-end-performance-update-17/

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Daniel Stenberg: Why they use curl rss_planet_mozilla 25-04-2019 00:41


As a reader of my blog you know curl. You also most probably already know why you would use curl and if I’m right, you’re also a fan of using the right tool for the job. But do you know why others use curl and why they switch from other solutions to relying on curl for their current and future data transfers? Let me tell you the top reasons I’m told by users.

Logging and exact error handling

What exactly happened in the transfer and why are terribly important questions to some users, and with curl you have the tools to figure that out and also be sure that curl either returns failure or the command worked. This clear and binary distinction is important to users for whom that single file every transfer is important. For example, some of the largest and most well-known banks in the world use curl in their back-ends where each file transfer can mean a transfer of extremely large sums of money.

A few years ago I helped a money transaction service switch to curl to get that exact line in the sand figured out. To know exactly and with certainty if money had been transferred – or not – for a given operation. Vital for their business.

curl does not have the browsers’ lenient approach of “anything goes as long as we get something to show” when it comes to the Internet protocols.

Verbose goodness

curl’s verbose output options allow users to see exactly what curl sends and receives in a quick and non-complicated way. This is invaluable for developers to figure out what’s happening and what’s wrong, in either end involved in the data transfer.

curl’s verbose options allows developers to see all sent and received data even when encryption is used. And if that is not enough, its SSLKEYLOGFILE support allows you to take it to the next level when you need to!

Same behavior over time

Users sometimes upgrade their curl installations after several years of not having done so. Bumping a software’s version after several years and many releases, any software really, can be a bit of a journey and adventure many times as things have changed, behavior is different and things that previously worked no longer do etc.

With curl however, you can upgrade to a version that is a decade newer, with lots of new fancy features and old crummy bugs fixed, only to see that everything that used to work back in the day still works – the same way. With curl, you can be sure that there’s an enormous focus on maintaining old functionality when going forward.

Present on all platforms

The fact that curl is highly portable, our users can have curl and use curl on just about any platform you can think of and use it with the same options and behaviors across them all. Learn curl on one platform, then continue to use it the same way on the next system. Platforms and their individual popularity vary over time and we enjoy to allow users to pick the ones they like – and you can be sure that curl will run on them all.

Performance

When doing the occasional file transfer every once in a while, raw transfer performance doesn’t matter much. Most of the time will then just be waiting on network anyway. You can easily get away with your Python and java frameworks’ multiple levels of overhead and excessive memory consumption.

Users who scan the Internet or otherwise perform many thousands of transfers per second from a large number of threads and machines realize that they need fewer machines that spend less CPU time if they build their file transfer solutions on top of curl. In curl we have a focus on only doing what’s required and it’s a lean and trimmed solution with a well-documented API built purely for Internet data transfers.

The features you want

The author of a banking application recently explained for us that one of the top reasons why they switched to using curl for doing their Internet data transfers, is curl’s ability to keep the file name from the URL.

curl is a feature-packed tool and library that most likely already support the protocols you need and provide the power features you want. With a healthy amount of “extension points” where you can extend it or hook in your custom extra solution.

Support and documentation

No other tool or library for internet transfers have even close to the same amount of documentation, examples available on the net, existing user base that can help out and friendly users to support you when you run into issues. Ask questions on the mailing lists, post a bug on the bug tracker or even show your non-working code on stackoverflow to further your project.

curl is really the

×èòàòü äàëåå...
êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Henri Sivonen: It’s Time to Stop Adding New Features for Non-Unicode Execution Encodings in C++ rss_planet_mozilla 24-04-2019 20:09


https://hsivonen.fi/non-unicode-in-cpp/

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Ryan Harter: When the Bootstrap Breaks - ODSC 2019 rss_planet_mozilla 24-04-2019 10:00


I'm excited to announce that I'll be presenting at the Open Data Science Conference in Boston next week. My colleague Saptarshi and I will be talking about When the Bootstrap Breaks.

I've included the abstract below, but the high-level goal of this talk is to strip some varnish off the bootstrap. Folks often look to the bootstrap as a panacea for weird data, but all tools have their failure cases. We plan on highlighting some problems we ran into when trying to use the bootstrap for Firefox data and how we dealt with the issues, both in theory and in practice.

Abstract:

Resampling methods like the bootstrap are becoming increasingly common in modern data science. For good reason too; the bootstrap is incredibly powerful. Unlike t-statistics, the bootstrap doesn’t depend on a normality assumption nor require any arcane formulas. You’re no longer limited to working with well understood metrics like means. One can easily build tools that compute confidence for an arbitrary metric. What’s the standard error of a Median? Who cares! I used the bootstrap.

With all of these benefits the bootstrap begins to look a little magical. That’s dangerous. To understand your tool you need to understand how it fails, how to spot the failure, and what to do when it does. As it turns out, methods like the bootstrap and the t-test struggle with very similar types of data. We’ll explore how these two methods compare on troublesome data sets and discuss when to use one over the other.

In this talk we’ll explore what types to data the bootstrap has trouble with. Then we’ll discuss how to identify these problems in the wild and how to deal with the problematic data. We will explore simulated data and share the code to conduct the simulations yourself. However, this isn’t just a theoretical problem. We’ll also explore real Firefox data and discuss how Firefox’s data science team handles this data when analyzing experiments.

At the end of this session you’ll leave with a firm understanding of the bootstrap. Even better, you’ll understand how to spot potential issues in your data and avoid false confidence in your results.

https://blog.harterrt.com/odsc-2019.html

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
The Mozilla Blog: It’s Complicated: Mozilla’s 2019 Internet Health Report rss_planet_mozilla 24-04-2019 06:00


Our annual open-source report examines how humanity and the internet intersect. Here’s what we found

 

Today, Mozilla is publishing the 2019 Internet Health Report — our third annual examination of the internet, its impact on society and how it influences our everyday lives.

http://internethealthreport.org/2019/

The Report paints a mixed picture of what life online looks like today. We’re more connected than ever, with humanity passing the ‘50% of us are now online’ mark earlier this year. And, while almost all of us enjoy the upsides of being connected, we also worry about how the internet and social media are impacting our children, our jobs and our democracies.

When we published last year’s Report, the world was watching the Facebook-Cambridge Analytica scandal unfold — and these worries were starting to grow. Millions of people were realizing that widespread, laissez-faire sharing of our personal data, the massive growth and centralization of the tech industry, and the misuse of online ads and social media was adding up to a big mess.

Over the past year, more and more people started asking: what are we going to do about this mess? How do we push the digital world in a better direction?

As people asked these questions, our ability to see the underlying problems with the system — and to imagine solutions — has evolved tremendously. Recently, we’ve seen governments across Europe step up efforts to monitor and thwart disinformation ahead of the upcoming EU elections. We’ve seen the big tech companies try everything from making ads more transparent to improving content recommendation algorithms to setting up ethics boards (albeit with limited effect and with critics saying ‘you need to do much more!’). And, we’ve seen CEOs and policymakers and activists wrestling with each other over where to go next. We have not ‘fixed’ the problems, but it does feel like we’ve entered a new, sustained era of debate about what a healthy digital society should look like.

The 2019 Internet Health Report examines the story behind these stories, using interviews with experts, data analysis and visualization, and original reporting. It was also built with input from you, the reader: In 2018, we asked readers what issues they wanted to see in the next Report.

In the Report’s three spotlight articles, we unpack three big issues: One examines the need for better machine decision making — that is, asking questions like Who designs the algorithms? and What data do they feed on? and Who is being discriminated against? Another examines ways to rethink the ad economy, so surveillance and addiction are no longer design necessities.  The third spotlight article examines the rise of smart cities, and how local governments can integrate tech in a way that serves the public good, not commercial interests.

Of course, the Report isn’t limited to just three topics. Other highlights include articles on the threat of deepfakes, the potential of user-owned social media platforms, pornography literacy initiatives, investment in undersea cables, and the dangers of sharing DNA results online.

So, what’s our conclusion? How healthy is the internet right now? It’s complicated — the digital environment is a complex ecosystem, just like the planet we live on. There have been a number of positive trends in the past year that show that the internet — and our relationship with it — is getting healthier:

Calls for privacy are becoming mainstream. The last year brought a tectonic shift in public awareness about privacy and security in the digital world, in great part due to the Cambridge Analytica scandal. That awareness is continuing to grow — and also translate into action. European regulators, with help from civil society watchdogs and individual internet users, are enforcing the GDPR: In recent months, Google has been fined ˆ50 million for GDPR violations in France, and tens of thousands of violation complaints have been filed across the continent.

There’s a movement to build more responsible AI. As the flaws with today’s AI become more apparent, technologists and activists are speaking

×èòàòü äàëåå...
êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Mark Surman: Why AI + consumer tech? rss_planet_mozilla 23-04-2019 23:41


In my last post, I shared some early thoughts on how Mozilla is thinking about AI as part of our overall internet health agenda. I noted in that post that we’re leaning towards consumer tech as the focus and backdrop for whatever goals we take on in AI. In our draft issue brief we say:

Mozilla is particularly interested in how automated decision making is being used in consumer products and services. We want to make sure the interests of all users are designed into these products and services. Where they aren’t, we want to call that out.

After talking to nearly 100 AI experts and activists, this consumer tech focus feels right for Mozilla. But it also raises a number of questions: what do we mean by consumer tech? What is in scope for this work? And what is not? Are we missing something important with this focus?

At its simplest, the consumer tech platforms that we are talking about are general purpose internet products and services aimed at a wide audience for personal use. These include things like social networks, search engines, retail e-commerce, home assistants, computers, smartphones, fitness trackers, self-driving cars, etc. — almost all of which are connected to the internet and are fueled by our personal data. The leading players in these areas are companies like Google, Amazon, Facebook, Microsoft and Apple in the US as well as companies like Baidu, TenCent, and AliBaba in China. These companies are also amongst the biggest developers and users of AI, setting trends and shipping technology that shapes the whole of the tech industry. And, as long as we remain in the era of machine learning, these companies have a disproportionate advantage in AI development as they control huge amounts for data and computing power that can be used to train automated systems.

Given the power of the big tech companies in shaping the AI agenda — and the growing pervasiveness of automated decision making in the tech we all use everyday — we believe we need to set a higher bar for the development, use and impact of AI in consumer products and services. We need a way to reward companies who reach that bar. And push back and hold to account those who do not.

Of course, AI isn’t bad or good on it’s own — it is just another tool in the toolbox of computer engineering. Benefits, harms and side effects come from how systems are designed, what data is selected to train them and what business rules they are given. For example, search for ‘doctor’ on Google, you mostly see white doctors because that bias is in the training data. Similarly, content algorithms on sites like YouTube often recommend increasingly extreme content because the main business rule they are optimized for is to keep people on the site or app for as long as possible. Humans — and the companies they work in — can avoid or fix problems like these. Helping them do so is important work. It’s worth doing.

Of course, there are important issues related to AI and the health of the internet that go beyond consumer technology. The use of biased facial recognition software by police and immigration authorities. Similarly biased and unfair resume sorting algorithms used by human resource departments as part of hiring processes. The use of AI by the military to automate and add precision to killing from a distance. Ensuring that human rights and dignity are protected as the use of machine decision making grows within government and the back offices of big business is critical. Luckily, there is an amazing crew of organizations stepping up to address these issues such as AI Now in the US and Algorithm Watch in Europe. Mozilla taking a lead in these areas wouldn’t add much. Here, we should play a supporting role.

In contrast, there are few players focused squarely on how AI is showing up in consumer products and services. Yet this is one of the places where the power and the impact of AI is moving most rapidly. Also, importantly, consumer tech is the field on which Mozilla has always played. As we try to shape where AI is headed, it makes sense to do so here. We’re already doing so in small ways with technology, showing a more open way to approach machine learning with projects like Deep Speech and Common Voice. However, we have a chance to do much more by using our community, brand and credibility to push the development and use of AI in consumer tech in the right direction. We might do this as a

×èòàòü äàëåå...
êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
The Firefox Frontier: 5 times when video ads autoplay and ruin everything. rss_planet_mozilla 23-04-2019 18:42


The room is dark and silent. Suddenly, a loud noise pierces your ears. You panic as everyone turns in your direction. You just wanted to read an article about cute … Read more

The post 5 times when video ads autoplay and ruin everything. appeared first on The Firefox Frontier.

https://blog.mozilla.org/firefox/5-times-when-video-ads-autoplay-and-ruin-everything/

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Ian Bicking: “Users want control” is a shoulder shrug rss_planet_mozilla 23-04-2019 08:00


Making the claim “users want control” is the same as saying you don’t know what users want, you don’t know what is good, and you don’t know what their goals are.

I first started thinking about this during the debate over what would become the ACA. The rhetoric was filled with this idea that people want choice in their medical care: people want control.

No! People want good health care. If they don’t trust systems to provide them good health care, if they don’t trust their providers to understand their priorities, then choice is the fallback: it’s how you work the system when the system isn’t working for you. And it sucks! Here you are, in the middle of some health issue, with treatments and symptoms and the rest of your life duties, and now you have to become a researcher on top of it? But the politicians and the pundits could not stop talking about control.

Control is what you need when you want something and it won’t happen on its own. But (usually) it’s not control you want, it’s just a means.

So when we say users want control over X – their privacy, their security, their data, their history – we are first acknowledging that current systems act against users, but we aren’t proposing any real solution. We’re avoiding even talking about the problems.

For instance, we say “users want control over their privacy,” but what people really want is some subset of:

  1. To avoid embarrassment
  2. To avoid persecution
  3. … sometimes for doing illegal and wrong things
  4. To keep from having the creeping sensation they left something sitting out that they didn’t want to
  5. They want to make some political statement against surveillance
  6. They want to keep things from the prying eyes of those close to them
  7. They want to avoid being manipulated by bad-faith messaging

There’s no easy answers, not everyone holds all these desires, but these are concrete ways of thinking about what people want. They don’t all point in the same direction. (And then consider the complex implications of someone else talking about you!)

There are some cases when a person really does want control. If the person wants to determine their own path, if having choice is itself a personal goal, then you need control. That’s a goal about who you are not just what you get. It’s worth identifying moments when this is important. But if a person does not pay attention to something then that person probably does not identify with the topic and is not seeking control over it. “Privacy advocates” pay attention to privacy, and attain a sense of identity from the very act of being mindful of their own privacy. Everyone else does not.

Let’s think about another example: users want control over their data. What are some things they want?

  1. They don’t want to lose their data
  2. They don’t want their data used to hold them hostage (e.g., to a subscription service)
  3. They don’t want to delete data and have it still reappear
  4. They want to use their data however they want, but more likely they want their data available for use by some other service or tool
  5. They feel it’s unfair if their data is used for commercial purposes without any compensation
  6. They are offended if their data is used to manipulate themselves or others
  7. They don’t want their data used against them in manipulative ways
  8. They want to have shared ownership of data with other people
  9. They want to prevent unauthorized or malicious access to their data

Again these motivations are often against each other. A person wants to be able to copy their data between services, but also delete their data permanently and completely. People don’t want to lose their data, but having personal control over your data is a great way to lose it, or even to lose control over it. The professionalization and centralization of data management by services has mostly improved access control and reliability.

When we simply say users want control it’s giving up on understanding people’s specific desires. Still it’s not exactly wrong: it’s reasonable to assume people will use control to achieve their desires. But if, as technologists, we can’t map functionality to desire, it’s a bit of a stretch to imagine everyone else will figure it out on the fly.

https://www.ianbicking.org/blog/2019/04/users-want-control-is-a-shrug.html

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
David Humphrey: Teaching Open Source: Sept 2018 - April 2019 rss_planet_mozilla 23-04-2019 05:31


https://blog.humphd.org/teaching-open-source-sept-2018-april/

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Mozilla VR Blog: VoxelJS: Chunking Magic rss_planet_mozilla 23-04-2019 02:05


VoxelJS: Chunking Magic

A couple of weeks ago I relaunched VoxelJS with modern ThreeJS and modules support. Today I'm going to talk a little bit about how VoxelJS works internally, specifically how voxels are represented and drawn. This is the key magic part of a voxel engine and I owe a tremendous debt to Max Ogden, James Halliday and Mikola Lysenko

Voxels are represented by numbers in a large three dimensional array. Each number says what type of block goes in that block slot, with 0 representing empty. The challenge is how to represent a potentially infinite set of voxels without slowing the computer to a crawl. The only way to do this is to load just a portion of the world at a time.

The Challenge

We could do this by generating a large structure, say 512 x 512 x 512 blocks, then setting entries in the array whenever the player creates or destroys a block.

This process would work, but we have a problem. A grid of numbers is great for representing voxels, but we don’t have a way to draw them. ThreeJS (and by extension, WebGL) only thinks in terms of polygons. So we need a process to turn the voxel grid in to a polygon mesh. This process is called meshing.

The simple way to do meshing would be to create a cube object for every voxel in the original data. This would work but it would be incredibly slow. Plus most cubes are inside terrain and would never be seen, but they would still take up memory in the GPU and have to be culled from the view. Clearly we need to find a way to create more efficient meshes.

Even if we have a better way to do the meshing we still have another problem. When the player creates or destroys a block we need to update the data. Updating a single entry in an array is easy, but then we have to recreate the entire polygon mesh. That’s going to be incredibly slow if we must rebuild the entire mesh of the entire world every time the player creates a block, which will happen a lot in a game like Minecraft. VoxelJS has a solution to both of our problems. : chunks.

Chunks

The world is divided into large cubes called chunks, each composed of a uniform set of blocks. By default each chunk is 16x16x16 blocks, though you can adjust this depending on your need. Each chunk is represented by an array of block types, just as before, and when any block in the chunk changes the entire chunk transformed into a new polygon mesh and sent to the GPU. Because the chunk is so much smaller than before this shouldn’t take very long. In practice it can be done in less than a few milliseconds on a laptop. Only the single chunk which changed needs to be updated, the rest can stay in GPU memory and just be redrawn every frame (which is something modern GPUs excel at).

As the player moves around the world VoxelJS maintains a list of chunks that are near by. This is defined as chunks within a certain radius of the player (also configurable). As chunks go out of the nearby list they are deleted and new chunks are created. Thus at any given time only a small number of chunks are in memory.

This chunking system works pretty well with the naive algorithm that creates a cube for every voxel, but it still ends up using a tremendous amount of GPU memory and can be slow to draw, especially on mobile GPUs. We can do better.

Meshing

The first optimization is to only create cube geometry for voxels that are actually visible from the outside. All internal voxels can be culled. In VoxelJS this is implemented by the CullingMesher in the main src directory.

Culling works a lot better than naive meshing, but we could still do better. If there is a 10 x 10 wall it would be represented by 100 cubes (then multiplied another 12 when turned into triangles); yet in theory the wall could be drawn with a single box object.

To handle this case VoxelJS has another algorithm called the GreedyMesher. This produces vastly more efficient meshes and takes only slightly longer to compute than the culling mesh.

There is a downside to greedy meshing, however. Texture mapping is very easy on cubes, especially when using a texture atlas. The same with

×èòàòü äàëåå...
êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Mozilla B-Team: happy bmo push day! rss_planet_mozilla 22-04-2019 18:59


Bugfixes + enabling the new security feature for API keys.

release tag

the following changes have been pushed to bugzilla.mozilla.org:

  • [1541303] Default component bug type is not set as expected; enhancement severity is still used for existing bugs
  • [1543760] When cloning a bug, the bug is added to ‘Regressed by’ of the new bug
  • [1543718] Obsolete attachments should have a strikethrough
  • [1543798] Do not treat email addresses with invalid.bugs as unassigned…

View On WordPress

https://mozilla-bteam.tumblr.com/post/184367688188

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè
Daniel Stenberg: curl + hackerone = TRUE rss_planet_mozilla 22-04-2019 18:03


There seems to be no end to updated posts about bug bounties in the curl project these days. Not long ago I mentioned the then new program that sadly enough was cancelled only a few months after its birth.

Now we are back with a new and refreshed bug bounty program! The curl bug bounty program reborn.

This new program, which hopefully will manage to survive a while, is setup in cooperation with the major bug bounty player out there: hackerone.

Basic rules

If you find or suspect a security related issue in curl or libcurl, report it! (and don’t speak about it in public at all until an agreed future date.)

You’re entitled to ask for a bounty for all and every valid and confirmed security problem that wasn’t already reported and that exists in the latest public release.

The curl security team will then assess the report and the problem and will then reward money depending on bug severity and other details.

Where does the money come from?

We intend to use funds and money from wherever we can. The Hackerone Internet Bug Bounty program helps us, donations collected over at opencollective will be used as well as dedicated company sponsorships.

We will of course also greatly appreciate any direct sponsorships from companies for this program. You can help curl getting even better by adding funds to the bounty program and help us reward hard-working researchers.

Why bounties at all?

We compete for the security researchers’ time and attention with other projects, both open and proprietary. The projects that can help put food on these researchers’ tables might have a better chance of getting them to use their tools, time, skills and fingers to find our problems instead of someone else’s.

Finding and disclosing security problems can be very time and resource consuming. We want to make it less likely that people give up their attempts before they find anything. We can help full and part time security engineers sustain their livelihood by paying for the fruits of their labor. At least a little bit.

Only released code?

The state of the code repository in git is not subject for bounties. We need to allow developers to do mistakes and to experiment a little in the git repository, while we expect and want every actual public release to be free from security vulnerabilities.

So yes, the obvious downside with this is that someone could spot an issue in git and decide not to report it since it doesn’t give any money and hope that the flaw will linger around and ship in the release – and then reported it and claim reward money. I think we just have to trust that this will not be a standard practice and if we in fact notice that someone tries to exploit the bounty in this manner, we can consider counter-measures then.

How about money for the patches?

There’s of course always a discussion as to why we should pay anyone for bugs and then why just pay for reported security problems and not for heroes who authored the code in the first place and neither for the good people who write the patches to fix the reported issues. Those are valid questions and we would of course rather pay every contributor a lot of money, but we don’t have the funds for that. And getting funding for this kind of dedicated bug bounties seem to be doable, where as a generic pay contributors fund is trickier both to attract money but it is also really hard to distribute in an open project of curl’s nature.

How much money?

At the start of this program the award amounts are as following. We reward up to this amount of money for vulnerabilities of the following security levels:

Critical: 2,000 USD
High: 1,500 USD
Medium: 1,000 USD
Low: 500 USD

Depending on how things go, how fast we drain the fund and how much companies help us refill, the amounts may change over time.

Found a security flaw?

Report it!

https://daniel.haxx.se/blog/2019/04/22/curl-hackerone-true/

êîììåíòàðèè: 0 ïîíðàâèëîñü! ââåðõ^ ê ïîëíîé âåðñèè